clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\003_SQ1_1948_2015.csv"
drop in 53/l
destring geofips, replace
replace geofips=geofips/1000
rename geofips statefips
drop region table linecode  industryclassification description geoname
forvalues iteration=8(1)15 {
replace v`iteration'="" if v`iteration'=="(NA)"
}
destring v8 v9 v10 v11 v12 v13 v14 v15, replace
reshape long v, i(statefips) j(marker)
replace marker=marker-8
gen temp1=marker/4
gen temp2=round(temp1,1)
gen quar=temp1-temp2
recode quar (0=1) (.25=2) (-.5=3) (-.25=4)
drop temp1 temp2
gen temp1=quar
recode temp1 (2=0) (3=0) (4=0)
replace temp1=1948 if marker==0
tsset statefips marker
by statefips: gen year=sum(temp1)
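*Optional sanity check (a sketch added for illustration, not part of the original workflow).
*Assumption: marker counts quarters from 0 at 1948q1, as the recodes above imply, so quar and
*year should match the closed-form expressions below.  assert stops the do-file if they do not.
assert quar==mod(marker,4)+1
assert year==1948+floor(marker/4)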
drop marker temp1
rename v inc_q
order year quar statefips inc_q
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\014_SQ1_1948_2015.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\004_SA1_1929_2014.csv"
drop in 157/l
drop if linecode==3
destring geofips, replace
replace geofips=geofips/1000
rename geofips statefips
*pop is linecode=2, income is linecode=1.
drop  geoname region table industryclassification description
forvalues iteration=8(1)28 {
replace v`iteration'="" if v`iteration'=="(NA)"
destring v`iteration', replace
}

reshape long v, i(statefips linecode) j(year)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\015_SA1_1929_2014.dta", replace
drop if linecode==1
drop linecode
rename v pop_a
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\015_SA1_1929_2014.dta", clear
drop if linecode==2
drop linecode
rename v inc_a
merge 1:1 statefips year using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
*merge = 3 only
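*Optional check (a sketch, not part of the original workflow): confirm the note above instead of relying on it.
*capture noisily reports a failed assertion without stopping the do-file.
capture noisily assert _merge==3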
drop _merge
gen quar=2.5
replace year=year+1921
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\015_SA1_1929_2014.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\005_disposableincome.csv"
drop in 1/4
drop in 158/l
replace v1="" in 1
destring v1, replace
replace v1=v1/1000
rename v1 statefips
drop v2 v4
rename v3 linecode
drop in 1
keep if linecode=="51"
drop linecode
forvalues iteration=5(1)16 {
replace v`iteration'="" if v`iteration'=="(NA)"
destring v`iteration', replace
}

reshape long v, i(statefips) j(year)
replace year=year+1943
rename v disinc_a
gen quar=2.5
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\016_disposableincome.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\006_cpinat.csv"
drop v14 v15
drop in 1/11
rename v1 year
destring v2-v13, replace
reshape long v, i(year) j(month)
replace month=month-1
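*Optional check (a sketch, not part of the original workflow): the reshape suffixes were 2-13 (v2-v13),
*so after subtracting 1 every month should fall in 1-12.
assert inrange(month,1,12)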
rename v cpinat_m
destring year, replace
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\017_cpinat.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\007_cpinortheast.csv"
drop v14 v15 v16
drop in 1/11
rename v1 year
destring v2-v13, replace
reshape long v, i(year) j(month)
replace month=month-1
rename v cpireg_m
gen region=4
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\018_cpinortheast.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\008_cpisouth.csv"
drop v14 v15 v16
drop in 1/11
rename v1 year
destring v2-v13, replace
reshape long v, i(year) j(month)
replace month=month-1
rename v cpireg_m
gen region=1
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\019_cpisouth.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\009_cpimidwest.csv"
drop v14 v15 v16
drop in 1/11
rename v1 year
destring v2-v13, replace
reshape long v, i(year) j(month)
replace month=month-1
rename v cpireg_m
gen region=3
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\020_cpimidwest.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\010_cpiwest.csv"
drop v14 v15 v16
drop in 1/11
rename v1 year
destring v2-v13, replace
reshape long v, i(year) j(month)
replace month=month-1
rename v cpireg_m
gen region=2
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\021_cpiwest.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\011_statehousingprices.txt"
rename v1 stateabrev
rename v2 year
rename v3 quar
rename v4 house_q
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\022_statehousingprices.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\012_gsp_naics_all_C.csv"
drop in 4681/l
destring geofips, replace
replace geofips=geofips/1000
rename geofips statefips
keep if industryid==1
drop  geoname region componentid componentname industryid industryclassification description
reshape long v, i(statefips) j(year)
rename v gspnaics_a
replace year=year+1988
gen quar=2.5
*gspnaics_a is read in as a string, not a number.
charlist gspnaics_a
*charlist (a user-written command) shows nothing unexpected among its characters, so destring is safe.
destring gspnaics_a, replace
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\023_gsp_naics_all_C.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\013_gsp_sic_all_C.csv"
drop in 4057/l
destring geofips, replace
replace geofips=geofips/1000
rename geofips statefips
keep if industryid==1
drop  geoname region componentid componentname industryid industryclassification description
reshape long v, i(statefips) j(year)
rename v gspsic_a
replace year=year+1954
gen quar=2.5
*gspsic_a is read in as a string, not a number.
charlist gspsic_a
*charlist (a user-written command) shows nothing unexpected among its characters, so destring is safe.
destring gspsic_a, replace
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\024_gsp_sic_all_C.dta", replace

*The following creates a monthly skeleton file.
*The skeleton file has either basic variables (year, quarter, stateno, statefips, etc.) or variables whose values are implied by information reported in the codebook (i.e., when different states' fiscal years are and when they changed).  
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\025Skeleton.csv"
recode quar (1=1) (2=4) (3=7) (4=10), gen(month)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
recode month (1=2) (4=5) (7=8) (10=11)
drop if month==2.5
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp2.dta", replace
recode month (2=3) (5=6) (8=9) (11=12)
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp2.dta"
*Now the fiscal year has to be fixed for the states whose fiscal years don't fit neatly onto quarters.  
*From the beginning of state gov finances (FY1942) to 1961, PA's fiscal year ended May 31.
*From the beginning of state gov finances (FY1941) to 2008, TX's fiscal year ended August 31.
*Those are the only two such states.  
replace fiscalyear=fiscalyear+1 if stateno==38&month==6&year<1961
replace fiscalyear=1942 if stateno==38&month==6&year==1941
replace fiscalyear=9999 if stateno==38&month==6&year==1961
replace fiscalyear=fiscalyear+1 if stateno==43&month==9
replace fiscalyear=1941 if stateno==43&year==1940&month==9
replace fiscalyear=. if stateno==43&year==2016&month==9
drop if quar==2.5
gen constant=1
egen constantsum=sum(constant), by(stateno fiscalyear)
sort stateno year month
replace constantsum=. if fiscalyear==.
tab constantsum
*After 12, the next number is 144.  There are values of 1, 3, 9 and 11 under 12.  
list stateno year month fiscalyear constantsum if constantsum<12
list stateno year month fiscalyear constantsum if fiscalyear==9999
*no instances of fiscalyear=9999 are separated in time.  
drop constant constantsum
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", replace

*PUT MONTHLY CPI DATA TOGETHER (APPEND REGIONAL CPI / MERGE WITH NAT CPI), AND IMPUTE MISSING VALUES.  
*Add cases to the regional CPI data
clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\018_cpinortheast.dta"
*1966 is the first year
destring year, replace
expand 2 if year<1985
replace year=year-19 in 601/l
tab year
replace  cpireg_m=. if year<1966
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\018_cpinortheast.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\019_cpisouth.dta", clear
destring year, replace
expand 2 if year<1985
replace year=year-19 in 601/l
tab year
replace  cpireg_m=. if year<1966
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\019_cpisouth.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\020_cpimidwest.dta", clear
destring year, replace
expand 2 if year<1985
replace year=year-19 in 601/l
tab year
replace  cpireg_m=. if year<1966
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\020_cpimidwest.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\021_cpiwest.dta", clear
destring year, replace
expand 2 if year<1985
replace year=year-19 in 601/l
tab year
replace  cpireg_m=. if year<1966
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\021_cpiwest.dta", replace

*Append the regional CPI files to one another, then merge them with the national data.
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\018_cpinortheast.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\019_cpisouth.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\020_cpimidwest.dta"
merge m:1 year month using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\017_cpinat.dta"
*merge=3 always
drop _merge

*the following imputes missing regional monthly data.  
sort region year month
*for all four regions, dec 1966 is the first month observed.  From there to Dec 1977, the last month of every quarter is the only 
*one observed.  From Feb 1978 to Dec 1986, every other month is observed.  Then every month is observed.  
*Create variables that measure the average rate of monthly growth between every other month as well as between the last month of each quarter.
gen yearmonth=(year*100)+month
recode month (1=0) (2=1) (3=0) (4=1) (5=0) (6=1) (7=0) (8=1) (9=0) (10=1) (11=0) (12=1), gen(month2)
recode month (1/2=0) (3=1) (4/5=0) (6=1) (7/8=0) (9=1) (10/11=0) (12=1), gen(month3)
gen month2b=0
replace month2b=1 if year>1977&year<1987
gen month3b=0
replace month3b=1 if year>1966&year<1978
tsset region yearmonth
by region: gen cpiregl1=cpireg_m[_n-1]
by region: gen cpiregl2=cpireg_m[_n-2]
by region: gen cpiregl3=cpireg_m[_n-3]
gen chg1=cpireg_m/cpiregl1
gen chg2=cpireg_m/cpiregl2
gen chg3=cpireg_m/cpiregl3
replace chg2=. if month2==0
replace chg3=. if month3==0
by region: gen chg2lead1=chg2[_n+1]
by region: gen chg3lead1=chg3[_n+1]
by region: gen chg3lead2=chg3[_n+2]
replace chg2=chg2lead1 if chg2==.
replace chg3=chg3lead1 if chg3==.
replace chg3=chg3lead2 if chg3==.
replace chg2=. if month3b==1
replace chg3=. if month2b==1
gen chg2b=chg2^(1/2)
gen chg3b=chg3^(1/3)
*The following checks the work.
gen temp=chg2b*chg2b
reg temp chg2
gen temp2=temp-chg2
sum temp2
*close enough, rounding error makes it a little off.
drop temp temp2
*The following creates change in national cpi over one month
by region: gen cpinatl1=cpinat_m[_n-1]
gen chgnat=cpinat_m/cpinatl1
*now impute for the months that have every other month.  
reg chg1 chg2b chgnat
predict est2
*now adjust for too much change or not enough change.  You know what growth was over two months.
*So growth in the two individual months must equal what growth was in the two months together.  
by region: gen est2l1=est2[_n-1]
gen temp1=est2*est2l1
replace temp1=. if month2==0
*the below is how much over or under change there was in two months.
gen temp2=chg2/temp1
*the below brings the over or under change down so that it applies to one month's change.
gen temp3=temp2^(1/2)
*the adjuster is still only for every other month, so fill in missing months.
by region: gen temp3lead1=temp3[_n+1]
replace temp3=temp3lead1 if temp3==.
*now make the adjustment
gen chg1imp2=est2*temp3
replace chg1imp2=. if month3b==1
*Now check to see that change over time for those two imputed months equals actual change over time in two months.  
by region: gen chg1imp2l1=chg1imp2[_n-1]
gen temp4=chg1imp2*chg1imp2l1
replace temp4=. if month2==0
reg chg2 temp4 if month2==1
*There is some rounding error eight decimals out, but they are essentially the same number.  Everything checks out.  
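*A tighter version of the regression check above (a sketch, not part of the original workflow).
*Assumption: a relative tolerance of 1e-5 is loose enough to absorb float rounding; adjust as needed.
*capture noisily reports a failure without stopping the do-file.
capture noisily assert reldif(temp4,chg2)<1e-5 if month2==1&!missing(temp4,chg2)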
drop  cpiregl1 cpiregl2 cpiregl3 chg2 chg2lead1 chg3lead1 chg3lead2 chg2b cpinatl1 est2 est2l1 temp1 temp2 temp3 temp3lead1 chg1imp2l1 temp4
*now impute for the months that have every third month observed.  
reg chg1 chg3b chgnat
predict est3
*now adjust for too much change or not enough change.  You know what growth was over three months.
*So growth in the three individual months must equal what growth was in the three months together.  
by region: gen est3l1=est3[_n-1]
by region: gen est3l2=est3[_n-2]
gen temp1=est3*est3l1*est3l2
replace temp1=. if month3==0
*the below is how much over or under change there was in the three months.
gen temp2=chg3/temp1
*the below brings the over or under change down so that it applies to one month's change.
gen temp3=temp2^(1/3)
*the adjuster is still only for every third month, so fill in missing months.
by region: gen temp3lead1=temp3[_n+1]
by region: gen temp3lead2=temp3[_n+2]
replace temp3=temp3lead1 if temp3==.
replace temp3=temp3lead2 if temp3==.
*now make the adjustment
gen chg1imp3=est3*temp3
replace chg1imp3=. if month2b==1
*Now check to see that change over time for those three imputed months equals actual change over time in three months.  
by region: gen chg1imp3l1=chg1imp3[_n-1]
by region: gen chg1imp3l2=chg1imp3[_n-2]
gen temp4=chg1imp3*chg1imp3l1*chg1imp3l2
replace temp4=. if month3==0
reg chg3 temp4 if month3==1
*There is some rounding error eight decimals out, but they are essentially the same number.  Everything checks out.  
drop  chg3 chg3b chgnat est3 est3l1 est3l2 temp1 temp2 temp3 temp3lead1 temp3lead2 chg1imp3l1 chg1imp3l2 temp4
*The following fills in the missing values for regional CPI from the imputed change variables.  
replace  chg1=chg1imp2 if chg1==.&month2b==1
replace  chg1=chg1imp3 if chg1==.&month3b==1
tab year if chg1==.
tab year if chg1!=.
*Everything checks out.
drop  chg1imp2 chg1imp3
tsset region yearmonth
by region: gen cpiregl1=cpireg_m[_n-1]
gen cpiregimp=chg1*cpiregl1
replace cpireg_m=cpiregimp if cpireg_m==.
drop cpiregl1 cpiregimp
by region: gen cpiregl1=cpireg_m[_n-1]
gen cpiregimp=chg1*cpiregl1
replace cpireg_m=cpiregimp if cpireg_m==.
drop cpiregl1 cpiregimp
*Check on work: If you do it again, it should come out the same.  
by region: gen cpiregl1=cpireg_m[_n-1]
gen cpiregimp=chg1*cpiregl1
gen temp=cpireg_m-cpiregimp
sum temp
*The differences are at five decimals out, at most (mean=almost zero, sd=.0000006), I guess that's rounding error.  I can live with that.  
drop  month2 month3 month2b month3b chg1 cpiregl1 cpiregimp temp
replace cpinat_m=cpinat_m/237.786
replace cpireg_m=cpireg_m/238.985
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\029CPINatReg.dta", replace

*Bring national CPI data into stateno=0 & bring monthly CPI data into the monthly skeleton file.  
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\017_cpinat.dta", clear
gen stateno=0
replace cpinat_m=cpinat_m/237.786
rename cpinat_m cpinat_mb
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", clear
merge m:1 year month region using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\029CPINatReg.dta"
drop _merge
drop yearmonth
merge m:1 year month stateno using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
tab stateno _merge
*no problems above.  only merged when stateno=0
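*Optional check (a sketch, not part of the original workflow): state the note above as a testable condition.
*Every matched observation should belong to stateno 0; a failure is reported but does not stop the do-file.
capture noisily assert stateno==0 if _merge==3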
drop _merge
replace cpinat_m=cpinat_mb if cpinat_m==.&stateno==0
drop cpinat_mb
rename cpinat_m cpinat
rename cpireg_m cpireg
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", replace

*The following gets the monthly unemployment data ready: unemploy_unadj and unemploy_adj
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\034UnemployUnadjM.txt"
rename value unemploy_unadj
tab footnote_codes
tab period year if footnote_codes=="P"
*all for sep 2015, as expected.  
drop footnote_codes
gen temp1=substr(series_id,-13,12)
tab temp1
*entirely 000000000000, cool.  
drop temp1
gen statefips=substr(series_id,6,2)
destring statefips, replace
gen temp1=substr(series_id,-1,1)
keep if temp1=="3"
drop temp1
drop series_id
drop if period=="M13"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\035UnemployAdjM.txt"
rename value unemploy_adj
tab year footnote_codes if footnote_codes!=""
*C means "corrected", so I'm not going to worry about it.  
drop footnote_codes
gen temp1=substr(series_id,-13,12)
tab temp1
*entirely 000000000000, cool.  
drop temp1
gen statefips=substr(series_id,6,2)
destring statefips, replace
gen temp1=substr(series_id,-1,1)
tab temp1
keep if temp1=="3"
drop temp1
drop series_id
merge 1:1 statefips year period using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
*perfect match
drop _merge
drop if statefips==72
gen month=substr(period,-2,2)
drop period
destring month, replace
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace

*Merge the unemployment data into the monthly file that has national and regional CPI in it.  
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", clear
merge 1:1 statefips year month using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
drop _merge

*Now change values to system missing if they aren't appropriate to aggregate into fiscal years.  
gen constant=1
gen temp1=1
replace temp1=0 if cpireg==.
gen temp2=1
replace temp2=0 if cpinat==.
gen temp3=1
replace temp3=0 if unemploy_adj==.
gen temp4=1
replace temp4=0 if unemploy_unadj==.
egen con0=sum(constant), by(stateno year fiscalyear)
egen con1=sum(temp1), by(stateno year fiscalyear)
egen con2=sum(temp2), by(stateno year fiscalyear)
egen con3=sum(temp3), by(stateno year fiscalyear)
egen con4=sum(temp4), by(stateno year fiscalyear)
gen dif1=con0-con1
gen dif2=con0-con2
gen dif3=con0-con3
gen dif4=con0-con4
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", replace

*Now collapse and save for merging with the main file.  
replace cpireg=. if dif1!=0
replace cpinat=. if dif2!=0
replace unemploy_adj=. if dif3!=0
replace unemploy_unadj=. if dif4!=0
replace cpireg=. if fiscalyear==.
replace cpinat=. if fiscalyear==.
replace unemploy_adj=. if fiscalyear==.
replace unemploy_unadj=. if fiscalyear==.
collapse (mean) cpireg cpinat unemploy_adj unemploy_unadj, by(stateno fiscalyear)
gen quar=2.5
rename fiscalyear year
drop if year==.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\026Skeleton_m.dta", clear
collapse (mean) cpireg cpinat unemploy_adj unemploy_unadj, by(stateno year quar)
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\025Skeleton.csv"
merge 1:1 stateno year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
tab year if _merge==2
*these are the six fiscalyear=9999 cases.  They should be dropped.  
drop if _merge==2
drop _merge

*MERGE IN QUARTERLY INCOME: inc
merge 1:1 statefips year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\014_SQ1_1948_2015.dta"
*no _merge=2
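*Optional check (a sketch, not part of the original workflow): the note above can be tested directly.
*The same line could follow each of the merges below; a failure is reported but does not stop the do-file.
capture noisily assert _merge!=2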
drop _merge

*MERGE IN ANNUAL INCOME AND POP: inc_a pop_a
merge 1:1 statefips year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\015_SA1_1929_2014.dta"
*no _merge=2
drop _merge

*MERGE IN DISPOSABLE INCOME
merge 1:1 statefips year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\016_disposableincome.dta"
*no _merge=2
drop _merge

*MERGE IN HOUSING PRICES
merge 1:1 stateabrev year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\022_statehousingprices.dta"
*no _merge=2
replace house_q=house_q/100
drop _merge

*MERGE IN GSP
merge 1:1 statefips year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\023_gsp_naics_all_C.dta"
*no _merge=2
drop _merge
merge 1:1 statefips year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\024_gsp_sic_all_C.dta"
*no _merge=2
drop _merge
*For the year these two variables share, they are highly correlated (R2=.999) when ignoring the U.S. as a whole (the outlier).  
reg gspnaics_a gspsic_a if gspsic_a<2000000000
sum gspnaics_a gspsic_a if gspnaics_a!=.&gspsic_a!=.
tab year if gspnaics_a!=.&gspsic_a!=.
*On average, gspnaics is 3.2% greater than gspsic for the year in which they were both measured.  That makes averaging them for the year they share 
*(1999) problematic.  
*For deflating state gov finances data, there is no choice but to average the two amounts for the year they overlap (1999).
*But if someone were to examine the impact of gsp growth on elections, they should compute change from 1998 to 1999 for gspsic, 
*and change from 1999 to 2000 for gspnaics.  But they won't be able to do that using change in GSP between fiscal years, 
*only between calendar years.  
*Merge the two GSP variables together.  
gen gsp_a=.
replace gsp_a=gspnaics_a if gspsic_a==.
replace gsp_a=gspsic_a if gspnaics_a==.
gen temp=(gspnaics_a+gspsic_a)/2
replace gsp_a=temp if gspnaics_a!=.&gspsic_a!=.
drop temp
*Change gsp variables to $1,000s, not $1,000,000s.  
replace gsp_a=gsp_a*1000
drop gspnaics_a gspsic_a

*Create time variable that can be tsset.
recode quar (1=1) (2=2) (2.5=3) (3=4) (4=5), gen(temp)
gen yearquar=(year*10)+temp
drop temp
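*Optional check (a sketch, not part of the original workflow): tsset needs one observation per stateno-yearquar pair.
*isid reports duplicates or missing values in the key; capture noisily keeps the do-file running either way.
capture noisily isid stateno yearquar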

save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", replace

*unemploy_a
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\036Unemploy1957to1975_A.csv", comma
rename unemploy unemploy_a
drop state
gen quar=2.5
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace

*Merge in unemploy_a
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", clear
merge 1:1 stateno year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", replace

*Creates a variable with calendar year data for the following four variables.  
gen temp1=cpireg
gen temp2=cpinat
gen temp3=unemploy_unadj
gen temp4=house_q
forvalues iteration=1(1)4 {
replace temp`iteration'=. if quar==2.5
egen temp`iteration'mean=mean(temp`iteration'), by(stateno year)
gen tempb=0
replace tempb=1 if temp`iteration'!=.
egen tempsum=sum(tempb), by(stateno year)
replace temp`iteration'mean=. if tempsum!=4
replace temp`iteration'mean=. if quar!=2.5
drop temp`iteration' tempb tempsum
}
rename temp1mean cpireg_a
rename temp2mean cpinat_a
rename temp3mean unemploy_b
rename temp4mean house_a
replace unemploy_a=unemploy_b if unemploy_a==.
drop unemploy_b

*yyy run from here.  


*Interpolate quarterly data for the following variables: pop, disinc gsp
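*Worked sketch of the geometric interpolation used below (made-up numbers, for illustration only).
*Annual values sit at quar=2.5.  If a series is 100 in one year and 108 in the next, the annual growth
*ratio is 1.08, and the exponents used below (1/8, 3/8, 5/8, 7/8) place Q3, Q4, and then Q1 and Q2
*of the following year at those fractions of the way between the two annual values.
forvalues k=1(2)7 {
display "eighths=`k'  interpolated value=" 100*1.08^(`k'/8)
}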
tsset stateno yearquar
gen temp1=pop_a
gen temp2=disinc_a
gen temp3=gsp_a
forvalues iteration=1(1)3 {
local blamo temp`iteration'
by stateno: gen templ1=`blamo'[_n-1]
by stateno: gen templ2=`blamo'[_n-2]
by stateno: gen templ3=`blamo'[_n-3]
by stateno: gen templ4=`blamo'[_n-4]
by stateno: gen templ5=`blamo'[_n-5]
gen temp=`blamo'
replace temp=templ1 if temp==.
replace temp=templ2 if temp==.
replace temp=templ3 if temp==.
replace temp=templ4 if temp==.
gen tempb=`blamo'/templ5
by stateno: gen temp2lead1=tempb[_n+1]
by stateno: gen temp2lead2=tempb[_n+2]
by stateno: gen temp2lead3=tempb[_n+3]
by stateno: gen temp2lead4=tempb[_n+4]
replace tempb=temp2lead1 if tempb==.
replace tempb=temp2lead2 if tempb==.
replace tempb=temp2lead3 if tempb==.
replace tempb=temp2lead4 if tempb==.
drop temp2lead1 temp2lead2 temp2lead3 temp2lead4
gen tempc=tempb
replace tempc=tempb^(1/8) if quar==3
replace tempc=tempb^(3/8) if quar==4
replace tempc=tempb^(5/8) if quar==1
replace tempc=tempb^(7/8) if quar==2
replace tempc=tempb^(8/8) if quar==2.5
*0 real changes made for the last one, good.  
*I'm just doing the following for a check on my work.  
replace temp=templ5 if quar==2.5
gen temp4=temp*tempc
gen temp5=`blamo'
gen tempdif=abs((temp4-temp5)/temp5)
sum tempdif
*differences 16 decimals out or less.  
replace temp4=. if quar==2.5
*Make the variable a round number.  
gen temp6=round(temp4,1)
replace `blamo'=temp6
drop  templ1 templ2 templ3 templ4 templ5 temp tempb tempc temp4 temp5 temp6 tempdif
}
rename temp1 pop
rename temp2 disinc
rename temp3 gsp

*Getting rid of _q suffixes, because you know which variables are quarterly right now.
rename inc_q inc
rename house_q house

*NOW AGGREGATE TO THE STATE FISCAL YEAR TO FILL IN QUAR=2.5 for the following variables.
*inc
*pop
*house
*disinc
*gsp
gen temp1=house
gen temp2=pop
gen temp3=inc
gen temp4=disinc
gen temp5=gsp

gen temp6=1
replace temp6=. if quar==2.5
egen temp6sum=sum(temp6), by(stateno fiscalyear)
*The following code accounts for the fact that PA (in the past) and TX have fiscal years that don't end when quarters end.  
*Through 1961, PA's fiscal year ended on May 31, so Q2 gets a weight of one-third in the coming FY, while the next Q2 gets a weight of two-thirds at the end of that FY.  
*TX's fiscal year ends Aug 31, so Q3 gets a weight of one-third in the coming FY, and the next Q3 gets a weight of two-thirds at the end of that FY.
*Since monthly data aren't available, the best you can do is weight the five quarters that appear in one FY appropriately.  
*This is somewhat awkward for PA in the early period, because lagging on yearquar would move the value from quar=3 into quar=2.5, when it belongs in quar=2.
gen tempfy=fiscalyear
replace tempfy=fiscalyear+1 if quar==2&stateno==38&year<1960
*get PA's first quarter
replace tempfy=1942 if stateno==38&year==1941&quar==2
replace tempfy=fiscalyear+1 if quar==3&stateno==43
gen temp8=1
replace temp8=. if quar==2.5
egen temp8sum=sum(temp8), by(stateno tempfy)

forvalues iteration=1(1)5 {
local blamo temp`iteration'
*The following tells you how many cases in the fiscal year are non-missing
gen temp7=0
replace temp7=1 if temp`iteration'!=.
egen temp7sum=sum(temp7), by(stateno fiscalyear)
*The following is "0" when the number of non-missing cases in the fiscal year match the number of quarters in the fiscal year.  
gen tempdif1=temp7sum-temp6sum
*the following aggregates the substantive variable in question to the fiscalyear
egen tempmean1=mean(temp`iteration'), by(stateno fiscalyear)
replace tempmean1=. if fiscalyear==.
replace tempmean1=. if tempdif1!=0
*the following aggregates the substantive variable in question to the special fiscalyear for the states that don't have fys that end at the end of a quarter
egen tempmean2=mean(temp`iteration'), by(stateno tempfy)
replace tempmean2=. if tempfy==.
gen tempmean2sum=(tempmean2+(tempmean1*2))/3
egen temp7sumb=sum(temp7), by(stateno tempfy)
gen tempdif2=temp7sumb-temp8sum
*The following line has to be done here instead of above, or quar=2.5's value will go into the averaging of the fiscal year.  
replace temp`iteration'=tempmean1 if tempdif1==0&quar==2.5
replace temp`iteration'=tempmean2sum if quar==2.5&stateno==38&fiscalyear<1961&tempdif1==0&tempdif2==0
replace temp`iteration'=tempmean2sum if quar==2.5&stateno==43&tempdif1==0&tempdif2==0
replace temp`iteration'=round(temp`iteration',1)
gen tempdif=tempmean1-tempmean2sum
tab stateno if tempdif!=0&tempdif!=.
*good, only PA and Tx resulted from the above
drop  temp7 temp7sum tempdif1 tempmean1 tempmean2 tempmean2sum temp7sumb tempdif2 tempdif
}
replace house=temp1 if quar==2.5
replace pop=temp2 if quar==2.5
replace inc=temp3 if quar==2.5
replace disinc=temp4 if quar==2.5
replace gsp=temp5 if quar==2.5
drop  temp1 temp2 temp3 temp4 temp5 temp6 temp6sum tempfy temp8 temp8sum

save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", replace

*What if you create/interpolate quarterly income data from annual data in the same way, and compare it to the income data that was measured quarterly?
*Interpolate inc_a to quarters to compare.  
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
*interpolate annual income (inc_a) for comparison to inc.  
tsset stateno yearquar
local blamo inc
by stateno: gen templ1=`blamo'_a[_n-1]
by stateno: gen templ2=`blamo'_a[_n-2]
by stateno: gen templ3=`blamo'_a[_n-3]
by stateno: gen templ4=`blamo'_a[_n-4]
by stateno: gen templ5=`blamo'_a[_n-5]
gen temp=`blamo'_a
replace temp=templ1 if temp==.
replace temp=templ2 if temp==.
replace temp=templ3 if temp==.
replace temp=templ4 if temp==.
gen temp2=`blamo'_a/templ5
by stateno: gen temp2lead1=temp2[_n+1]
by stateno: gen temp2lead2=temp2[_n+2]
by stateno: gen temp2lead3=temp2[_n+3]
by stateno: gen temp2lead4=temp2[_n+4]
replace temp2=temp2lead1 if temp2==.
replace temp2=temp2lead2 if temp2==.
replace temp2=temp2lead3 if temp2==.
replace temp2=temp2lead4 if temp2==.
drop temp2lead1 temp2lead2 temp2lead3 temp2lead4
gen temp3=temp2
replace temp3=temp2^(1/8) if quar==3
replace temp3=temp2^(3/8) if quar==4
replace temp3=temp2^(5/8) if quar==1
replace temp3=temp2^(7/8) if quar==2
replace temp3=temp2^(8/8) if quar==2.5
*0 real changes made for the last one, good.  
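*Illustration only (hypothetical numbers, not taken from the data): a sketch of the geometric interpolation used above.
*Assuming the annual value sits on the quar==2.5 row, the four quarters between two annual values A0 and A1 are centered
*1/8, 3/8, 5/8 and 7/8 of the way through the year, so each quarter is A0 times (A1/A0)^(k/8).
local ex_a0=100
local ex_a1=121
foreach ex_k of numlist 1 3 5 7 8 {
local ex_q=`ex_a0'*(`ex_a1'/`ex_a0')^(`ex_k'/8)
display "exponent `ex_k'/8 -> " `ex_q'
}
*With exponent 8/8 the formula returns A1 itself, which is what the check on incb below relies on.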
*I'm just doing the following for a check on my work.  
replace temp=templ5 if quar==2.5
gen incb=temp*temp3
gen temp5=`blamo'_a
gen tempdif=abs((incb-temp5)/temp5)
sum tempdif
*The relative differences are on the order of 1e-16 or smaller, i.e. floating-point noise.
replace incb=. if quar==2.5
*Make the variable a round number.  
replace incb=round(incb,1)
drop  templ1 templ2 templ3 templ4 templ5 temp temp2 temp3 temp5 tempdif
*Aggregate the interpolated quarterly inc data to the state fiscal year, and compare that.  
local blamo incb
egen tempmean1=mean(`blamo'), by(stateno fiscalyear)
replace tempmean1=. if fiscalyear==.
gen temp1=1 if `blamo'!=.
gen temp2=1
replace temp2=. if quar==2.5
egen temp1sum=sum(temp1), by(stateno fiscalyear)
egen temp2sum=sum(temp2), by(stateno fiscalyear)
gen tempdif1=temp1sum-temp2sum
drop temp1sum temp2sum
*The following code takes into account the fact that PA (in the past) and TX have fiscalyears that don't end when quarters end.  
*Until 1961, PA had a fiscal year that ended on May 31, so q2 gets a weight of one-third in the coming fy, while the next q2 gets a weight of two-thirds at the end of that fy.
*TX has a fy that ends Aug 31, so q3 gets a weight of one-third in the coming fy, and the next q3 gets a weight of two-thirds at the end of that fy.
*Since monthly data aren't available, the best you can do is weight the five quarters that appear in one fy appropriately.
*It is somewhat awkward with PA in the early period, because lagging on year-quar will put the value I want to move from quar=3 into quar=2.5, but I want it in quar=2.
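*Illustration only (hypothetical quarterly values): why (tempmean2 + 2 x tempmean1)/3 reproduces the one-third and two-thirds weights.
*tempmean1 averages the four quarters ending with the straddling quarter; tempmean2 averages the four quarters starting with the previous straddling quarter.
local ex_prev=90
local ex_mid=100+110+120
local ex_end=130
local ex_mean1=(`ex_mid'+`ex_end')/4
local ex_mean2=(`ex_prev'+`ex_mid')/4
local ex_short=(`ex_mean2'+2*`ex_mean1')/3
local ex_direct=(`ex_prev'/3+`ex_mid'+2*`ex_end'/3)/4
display "two-mean shortcut: " `ex_short'
display "direct weighting:  " `ex_direct'
*Both lines should show the same value, up to floating-point rounding.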
gen tempfy1=fiscalyear
replace tempfy1=fiscalyear+1 if quar==2&stateno==38&year<1960
*get PA's first quarter
replace tempfy1=1942 if stateno==38&year==1941&quar==2
replace tempfy1=fiscalyear+1 if quar==3&stateno==43
egen tempmean2=mean(`blamo'), by(stateno tempfy1)
replace tempmean2=. if fiscalyear==.
gen tempmean2sum=(tempmean2+(tempmean1*2))/3
gen dif=tempmean1-tempmean2sum
gen problem=.
replace problem=1 if dif!=0&dif!=.
replace problem=. if stateno==38&fiscalyear<1961
replace problem=. if stateno==43
replace problem=. if dif>-.000000000001&dif<.000000000001
tab problem
*no observations, good.  
drop problem
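*Illustration only: the tolerance band in the problem check above is needed because means computed two ways can differ in the last binary digits.
*abs(dif)<1e-12 would be an equivalent, shorter way to write the same band.
*In double-precision arithmetic the first comparison below is false and the second is true.
display 0.1+0.2==0.3
display abs(0.1+0.2-0.3)<1e-12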
egen temp1sum=sum(temp1), by(stateno tempfy1)
egen temp2sum=sum(temp2), by(stateno tempfy1)
gen tempdif2=temp1sum-temp2sum
*The following line has to be done here instead of above, or quar=2.5's value will go into the averaging of the fiscal year.  
replace `blamo'=tempmean1 if tempdif1==0&quar==2.5
replace `blamo'=tempmean2sum if quar==2.5&stateno==38&fiscalyear<1961&tempdif1==0&tempdif2==0
*the last line resulted in no changes, since gspnaics wasn't observed back then.  
replace `blamo'=tempmean2sum if quar==2.5&stateno==43&tempdif1==0&tempdif2==0
drop  tempmean1 temp1 temp2 tempdif1 tempfy1 tempmean2 tempmean2sum dif temp1sum temp2sum tempdif2
gen incdif=abs((inc-incb)/inc)
hist incdif if quar!=2.5
*some values are off by as much as 16%, but those are outliers.  
sum incdif if quar!=2.5
*On average, they are off by about two-thirds of a percent (mean=.68%, SD=.87%).
hist incdif if quar==2.5
*some values are off by as much as 7%, but those are outliers.  
sum incdif if quar==2.5
*on average, they are off by .39%, SD=.48%.  
clear


*STATE GOVERNMENT FINANCES DATA


clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\042_DebtandCashSecurities20150805.csv", comma
keep year4  statecode  totaldebtoutstanding
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\040_ExpendituresA20150805.csv", comma
keep year4 statecode  totalexpenditure generalexpenditure legislativetotalexp legislativecuropere26 legislativecapoutlay
merge 1:1 year4 statecode using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
*perfect merge
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\039_Revenues20150805.csv", comma
keep year4  statecode totalrevenue totalrevownsources generalrevenue genrevownsources totaltaxes propertytaxt01 totalgensalestaxt09 totalselectsalestax individualincometaxt40 corpnetincometaxt41 totalfedigrevenue
merge 1:1 year4 statecode using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
*perfect merge
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
gen stateno=.
replace stateno=8.5 if statecode==9
replace stateno=statecode if statecode<9
replace stateno=statecode-1 if statecode>9
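*Illustration only: the three replace lines above shift every code after 9 down by one and park code 9 at 8.5.
*Code 9 is presumably the District of Columbia in the Census numbering (an assumption, not checked here).
foreach ex_code of numlist 8 9 10 51 {
local ex_no=cond(`ex_code'==9,8.5,cond(`ex_code'<9,`ex_code',`ex_code'-1))
display "statecode `ex_code' -> stateno `ex_no'"
}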
drop statecode
order stateno
rename year4 fiscalyear
rename totalrevenue rev1
rename totalrevownsources revinternal1
rename generalrevenue genrev1
rename genrevownsources genrevinternal1
rename totaltaxes tax1
rename propertytaxt01 proptax1
rename totalgensalestaxt09 salestax1
rename totalselectsalestax salesselect1
rename individualincometaxt40 inctax1
rename corpnetincometaxt41 corptax1
rename totalfedigrevenue intergovrevfed1
rename totalexpenditure exp1
rename generalexpenditure genexp1
rename totaldebtoutstanding debt1
rename legislativetotalexp legtot1
rename legislativecuropere26 legop1
rename legislativecapoutlay legcap1
order  rev1 revinternal1 intergovrevfed1 genrev1 genrevinternal1 tax1 proptax1 salestax1 salesselect1 inctax1 corptax1 exp1 genexp1 debt1 legtot1 legop1 legcap1
charlist rev1
charlist revinternal1
charlist intergovrevfed1
charlist genrev1
charlist genrevinternal1
charlist tax1
charlist proptax1
charlist salestax1
charlist salesselect1
charlist inctax1
charlist corptax1
charlist exp1
charlist genexp1
charlist debt1
charlist legtot1
charlist legop1
charlist legcap1
destring rev1, replace ignore(,)
destring revinternal1, replace ignore(,)
destring intergovrevfed1, replace ignore(,)
destring genrev1, replace ignore(,)
destring genrevinternal1, replace ignore(,)
destring tax1, replace ignore(,)
destring proptax1, replace ignore(,)
destring salestax1, replace ignore(,)
destring salesselect1, replace ignore(,)
destring inctax1, replace ignore(,)
destring corptax1, replace ignore(,)
destring exp1, replace ignore(,)
destring genexp1, replace ignore(,)
destring debt1, replace ignore(,)
destring legtot1, replace ignore(,)
destring legop1, replace ignore(,)
destring legcap1, replace ignore(,)
tab rev1 if rev1<0
tab revinternal1 if revinternal1<0
tab intergovrevfed1 if intergovrevfed1<0
tab genrev1 if genrev1<0
tab genrevinternal1 if genrevinternal1<0
tab tax1 if tax1<0
tab proptax1 if proptax1<0
tab salestax1 if salestax1<0
tab salesselect1 if salesselect1<0
tab inctax1 if inctax1<0
tab corptax1 if corptax1<0
tab exp1 if exp1<0
tab genexp1 if genexp1<0
tab debt1 if debt1<0
tab legtot1 if legtot1<0
tab legop1 if legop1<0
tab legcap1 if legcap1<0
*The only negative number was -11111.  
recode rev1 revinternal1 intergovrevfed1 genrev1 genrevinternal1 tax1 proptax1 salestax1 salesselect1 inctax1 corptax1 exp1 genexp1 debt1 legtot1 legop1 legcap1 (-11111=.)
sort fiscalyear stateno
*browse
*I saw the following weirdness.  I didn't see anything else weird.  
replace legop1=. if fiscalyear<1977
replace legcap1=. if fiscalyear<1977
sort stateno fiscalyear
*browse
*I didn't see anything else weird.  
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
*Now merge the variables from the Census database into the main file.  
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", clear
merge m:1 stateno fiscalyear using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
list fiscalyear stateno if _merge==2
*four cases for the US as a whole in early years, and six states from early years are _merge==2
*they can be deleted.
drop if _merge==2
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", replace

*Re-downloaded all annual state finances files from the Census Web site to make sure they were all up-to-date.  https://www.census.gov/govs/state/historical_data.html.
*I converted all Excel files to CSV files, and saved those and all the text files under names of the following format: 044StateGovFinances (which has 1992), all the way up to 065StateGovFinances (which has 2013).
*The 1999 Excel file has two pages, so I copied and pasted the second page into the file 051.

*1992 to 1995 CSV files have the same format.
clear
gen fiscalyear=.
gen stateno=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\067StateGovFinances1992to1995.dta", replace
forvalues iteration=44(1)47 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
keep v1 v3
gen temp=0
replace temp=1 if v1=="ALABAMA"
replace temp=1 if v1=="ALASKA"
replace temp=1 if v1=="ARIZONA"
replace temp=1 if v1=="ARKANSAS"
replace temp=1 if v1=="CALIFORNIA"
replace temp=1 if v1=="COLORADO"
replace temp=1 if v1=="CONNECTICUT"
replace temp=1 if v1=="DELAWARE"
replace temp=1 if v1=="FLORIDA"
replace temp=1 if v1=="GEORGIA"
replace temp=1 if v1=="HAWAII"
replace temp=1 if v1=="IDAHO"
replace temp=1 if v1=="ILLINOIS"
replace temp=1 if v1=="INDIANA"
replace temp=1 if v1=="IOWA"
replace temp=1 if v1=="KANSAS"
replace temp=1 if v1=="KENTUCKY"
replace temp=1 if v1=="LOUISIANA"
replace temp=1 if v1=="MAINE"
replace temp=1 if v1=="MARYLAND"
replace temp=1 if v1=="MASSACHUSETTS"
replace temp=1 if v1=="MICHIGAN"
replace temp=1 if v1=="MINNESOTA"
replace temp=1 if v1=="MISSISSIPPI"
replace temp=1 if v1=="MISSOURI"
replace temp=1 if v1=="MONTANA"
replace temp=1 if v1=="NEBRASKA"
replace temp=1 if v1=="NEVADA"
replace temp=1 if v1=="NEW HAMPSHIRE"
replace temp=1 if v1=="NEW JERSEY"
replace temp=1 if v1=="NEW MEXICO"
replace temp=1 if v1=="NEW YORK"
replace temp=1 if v1=="NORTH CAROLINA"
replace temp=1 if v1=="NORTH DAKOTA"
replace temp=1 if v1=="OHIO"
replace temp=1 if v1=="OKLAHOMA"
replace temp=1 if v1=="OREGON"
replace temp=1 if v1=="PENNSYLVANIA"
replace temp=1 if v1=="RHODE ISLAND"
replace temp=1 if v1=="SOUTH CAROLINA"
replace temp=1 if v1=="SOUTH DAKOTA"
replace temp=1 if v1=="TENNESSEE"
replace temp=1 if v1=="TEXAS"
replace temp=1 if v1=="UTAH"
replace temp=1 if v1=="VERMONT"
replace temp=1 if v1=="VIRGINIA"
replace temp=1 if v1=="WASHINGTON"
replace temp=1 if v1=="WEST VIRGINIA"
replace temp=1 if v1=="WISCONSIN"
replace temp=1 if v1=="WYOMING"
gen stateno=sum(temp)
drop temp
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\067StateGovFinances1992to1995.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\067StateGovFinances1992to1995.dta", replace
}

sort fiscalyear count
drop if v3==""
tab v1
*The following gets rid of state labels.  
gen constant=1
egen constantsum=sum(constant), by(v1)
drop if constantsum==4
drop constant constantsum
charlist v3
destring v3, replace ignore(, )
rename v3 amount
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\067StateGovFinances1992to1995.dta", replace

*1996 to 1997 CSV files have the same format.
clear
gen fiscalyear=.
gen stateno=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\068StateGovFinances1996to1997.dta", replace
forvalues iteration=48(1)49 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
keep v1 v2
gen temp=0
replace temp=1 if v1=="Alabama"
replace temp=1 if v1=="Alaska"
replace temp=1 if v1=="Arizona"
replace temp=1 if v1=="Arkansas"
replace temp=1 if v1=="California"
replace temp=1 if v1=="Colorado"
replace temp=1 if v1=="Connecticut"
replace temp=1 if v1=="Delaware"
replace temp=1 if v1=="Florida"
replace temp=1 if v1=="Georgia"
replace temp=1 if v1=="Hawaii"
replace temp=1 if v1=="Idaho"
replace temp=1 if v1=="Illinois"
replace temp=1 if v1=="Indiana"
replace temp=1 if v1=="Iowa"
replace temp=1 if v1=="Kansas"
replace temp=1 if v1=="Kentucky"
replace temp=1 if v1=="Louisiana"
replace temp=1 if v1=="Maine"
replace temp=1 if v1=="Maryland"
replace temp=1 if v1=="Massachusetts"
replace temp=1 if v1=="Michigan"
replace temp=1 if v1=="Minnesota"
replace temp=1 if v1=="Mississippi"
replace temp=1 if v1=="Missouri"
replace temp=1 if v1=="Montana"
replace temp=1 if v1=="Nebraska"
replace temp=1 if v1=="Nevada"
replace temp=1 if v1=="New Hampshire"
replace temp=1 if v1=="New Jersey"
replace temp=1 if v1=="New Mexico"
replace temp=1 if v1=="New York"
replace temp=1 if v1=="North Carolina"
replace temp=1 if v1=="North Dakota"
replace temp=1 if v1=="Ohio"
replace temp=1 if v1=="Oklahoma"
replace temp=1 if v1=="Oregon"
replace temp=1 if v1=="Pennsylvania"
replace temp=1 if v1=="Rhode Island"
replace temp=1 if v1=="South Carolina"
replace temp=1 if v1=="South Dakota"
replace temp=1 if v1=="Tennessee"
replace temp=1 if v1=="Texas"
replace temp=1 if v1=="Utah"
replace temp=1 if v1=="Vermont"
replace temp=1 if v1=="Virginia"
replace temp=1 if v1=="Washington"
replace temp=1 if v1=="West Virginia"
replace temp=1 if v1=="Wisconsin"
replace temp=1 if v1=="Wyoming"
gen stateno=sum(temp)
drop temp
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\068StateGovFinances1996to1997.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\068StateGovFinances1996to1997.dta", replace
}

sort fiscalyear count
drop if v2==""
drop if v1==""
tab v1
charlist v2
destring v2, replace ignore( ,) force
rename v2 amount
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\068StateGovFinances1996to1997.dta", replace

*The 1998 CSV file is the only one with this format.
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\050StateGovFinances.csv", comma
drop in 57/l
drop in 4/5
drop if v2==""
drop v53-v67
drop if v1=="  General expenditure, by function:"
rename v1 y1
gen count=_n
reshape long v, i(y1 count) j(stateno)
replace stateno=stateno-2
rename y1 v1
rename v amount
gen fiscalyear=1998
sort stateno count
order fiscalyear stateno count
destring amount, replace ignore(,)
*I'm doing this to make sure that count is standardized across years.
sort stateno count
replace count=_n
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\069StateGovFinances1998.dta", replace

*1999 to 2001 CSV files have the same format.
clear
gen fiscalyear=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\070StateGovFinances1999to2001.dta", replace
forvalues iteration=51(1)53 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\070StateGovFinances1999to2001.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\070StateGovFinances1999to2001.dta", replace
}
drop if v2==""
drop in 101/103
drop in 51/53
drop in 1/3
drop v155-v256
compress v1
forvalues iteration=3(3)153 {
drop v`iteration'
}
forvalues iteration=4(3)154 {
drop v`iteration'
}
rename v1 y1
reshape long v, i(y1 fiscalyear count) j(stateno)
replace stateno=(stateno-2)/3
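*Illustration only: after the two drop loops above, the kept columns are v2, v5, v8, ..., v152, so (j-2)/3 maps them to 0, 1, 2, ..., 50.
*Zero is presumably the United States total (an assumption based on the column order).
foreach ex_j of numlist 2 5 8 152 {
local ex_no=(`ex_j'-2)/3
display "column v`ex_j' -> stateno `ex_no'"
}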
sort fiscalyear stateno count
rename y1 v1
rename v amount
charlist amount
replace amount="0" if amount=="-"
destring amount, replace ignore(,)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\070StateGovFinances1999to2001.dta", replace

*2002 to 2003 CSV files have the same format.
clear
gen fiscalyear=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\071StateGovFinances2002to2003.dta", replace
forvalues iteration=54(1)55 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\071StateGovFinances2002to2003.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\071StateGovFinances2002to2003.dta", replace
}
drop if v2==""
drop in 50/52
drop in 1/3
drop v104-v205
compress v1
forvalues iteration=3(2)103 {
drop v`iteration'
}
rename v1 y1
reshape long v, i(y1 fiscalyear count) j(stateno)
replace stateno=(stateno-2)/2
sort fiscalyear stateno count
rename y1 v1
rename v amount
charlist amount
replace amount="0" if amount=="-"
destring amount, replace ignore(,)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\071StateGovFinances2002to2003.dta", replace

*2004 to 2011 CSV files have the same format.
clear
gen fiscalyear=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\072StateGovFinances2004to2011.dta", replace
forvalues iteration=56(1)63 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\072StateGovFinances2004to2011.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\072StateGovFinances2004to2011.dta", replace
}
drop if v2==""
compress v1
drop in 323
drop in 277
drop in 231
drop in 185
drop in 139
drop in 93
drop in 47
drop in 1
drop v53
rename v1 y1
reshape long v, i(y1 fiscalyear count) j(stateno)
replace stateno=stateno-2
sort fiscalyear stateno count
rename y1 v1
rename v amount
charlist amount
replace amount="0" if amount=="-"
destring amount, replace ignore( (),)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\072StateGovFinances2004to2011.dta", replace

*2012 to 2013 CSV files have the same format.
clear
gen fiscalyear=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\073StateGovFinances2012to2013.dta", replace
forvalues iteration=64(1)65 {
clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.csv", comma
gen fiscalyear=1948+`iteration'
gen count=_n
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\073StateGovFinances2012to2013.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\073StateGovFinances2012to2013.dta", replace
}
drop in 54/55
drop in 1
replace v2="00" if v3=="United States"
drop v1
rename v2 statefips
reshape long v, i(fiscalyear statefips count) j(cat)
gsort cat -statefips
gen cat2=""
replace cat2=v if statefips=="Id2"
gen count2=_n
tsset count2
forvalues iteration=1(1)102 {
gen cat2L=cat2[_n-1]
replace cat2=cat2L if cat2==""
drop cat2L
}
drop if statefips=="Id2"
drop if cat2=="Geography"
compress cat2
charlist v
destring v, replace
rename v amount
destring statefips, replace
drop cat count2
rename cat2 v1
drop count
gen stateno=statefips
replace stateno=statefips-1 if statefips>2&statefips<7
replace stateno=statefips-2 if statefips>7&statefips<11
replace stateno=statefips-3 if statefips>11&statefips<14
replace stateno=statefips-4 if statefips>14&statefips<43
replace stateno=statefips-5 if statefips>43&statefips<52
replace stateno=statefips-6 if statefips>52&statefips<57
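*Illustration only: the six replace lines above close the gaps left by the unused FIPS codes 3, 7, 14, 43 and 52 and by code 11.
*Code 11 (District of Columbia) is not remapped, so this sketch, like the code above, assumes it does not occur in these files.
foreach ex_fips of numlist 1 6 42 48 56 {
local ex_shift=(`ex_fips'>2)+(`ex_fips'>7)+(`ex_fips'>11)+(`ex_fips'>14)+(`ex_fips'>43)+(`ex_fips'>52)
local ex_no=`ex_fips'-`ex_shift'
display "FIPS `ex_fips' -> stateno `ex_no'"
}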
drop statefips
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\073StateGovFinances2012to2013.dta", replace

*Now append files 067 to 073; the combined file is saved below as 074.
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\067StateGovFinances1992to1995.dta", clear
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\068StateGovFinances1996to1997.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\069StateGovFinances1998.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\070StateGovFinances1999to2001.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\071StateGovFinances2002to2003.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\072StateGovFinances2004to2011.dta"
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\073StateGovFinances2012to2013.dta"
sort fiscalyear
by fiscalyear: sum
*amount is fully observed for all years except 1999 (which was expected) and 2013 (missing four cases, which was unexpected).
*I checked the CSV file; those values are missing there as well.
replace v1=lower(v1)

*DROP UNWANTED CASES
drop if v1=="personal income (millions, calendar year 1998)"
drop if v1=="personal income (millions, calendar year 1999)"
drop if v1=="personal income (millions, calendar year 2000)"
drop if v1=="population (thousands)"
drop if v1=="population (thousands, 2003)"
drop if v1=="population (thousands, april 1, 2000)"
drop if v1=="population (thousands, july 1, 1999)"
drop if v1=="population (thousands, july 1, 2001)"
drop if v1=="population (thousands, july, 2002)"

*Standardize the categories.  
gen cat=v1
replace cat="debt at end of fiscal year" if cat=="debt outstanding, long term and short term"
replace cat="direct expenditure" if cat=="direct expenditure2"
replace cat="general expenditure, by function: - correction" if cat=="correction"
replace cat="general expenditure, by function: - education" if cat=="education"
replace cat="general expenditure, by function: - governmental administration" if cat=="government administration"
replace cat="general expenditure, by function: - health" if cat=="health"
replace cat="general expenditure, by function: - highways" if cat=="highways"
replace cat="general expenditure, by function: - hospitals" if cat=="hospitals"
replace cat="general expenditure, by function: - interest on general debt" if cat=="interest on general debt"
replace cat="general expenditure, by function: - natural resources" if cat=="natural resources"
replace cat="general expenditure, by function: - other and unallocable" if cat=="other and unallocable"
replace cat="general expenditure, by function: - parks and recreation" if cat=="parks and recreation"
replace cat="general expenditure, by function: - police protection" if cat=="police protection"
replace cat="general expenditure, by function: - public welfare" if cat=="public welfare"
replace cat="intergovernmental expenditure" if cat=="intergovernmental expenditure2"
replace cat="liquor stores expenditure" if cat=="liquor store expenditure"
replace cat="total expenditure - direct expenditure - assistance and subsidies" if cat=="assistance and subsidies"
replace cat="total expenditure - direct expenditure - capital outlay" if cat=="capital outlay"
replace cat="total expenditure - direct expenditure - current operations" if cat=="current operation"
replace cat="total expenditure - direct expenditure - insurance benefits and repayments" if cat=="insurance benefits and repayments"
replace cat="total expenditure - direct expenditure - interest on debt" if cat=="interest on debt"
replace cat="total expenditure - exhibit: salaries and wages" if cat=="exhibit:  salaries and wages"
replace cat="total expenditure - general expenditure" if cat=="general expenditure"
replace cat="total expenditure" if cat=="total expenditure2"
replace cat="total revenue - general revenue - current charges" if cat=="current charges"
replace cat="total revenue - general revenue - intergovernmental revenue" if cat=="intergovernmental revenue"
replace cat="total revenue - general revenue - miscellaneous general revenue" if cat=="miscellaneous general revenue"
replace cat="total revenue - general revenue - total taxes - all other taxes" if cat=="other taxes"
replace cat="total revenue - general revenue - total taxes - corporation net income taxes" if cat=="corporate income tax"
replace cat="total revenue - general revenue - total taxes - general sales and gross receipts taxes" if cat=="general sales"
replace cat="total revenue - general revenue - total taxes - individual income taxes" if cat=="individual income"
replace cat="total revenue - general revenue - total taxes - license taxes" if cat=="license taxes"
replace cat="total revenue - general revenue - total taxes - selective sales and gross receipts taxes" if cat=="selective sales"
replace cat="total revenue - general revenue - total taxes" if cat=="taxes"
replace cat="total revenue - general revenue" if cat=="general revenue"
replace cat="total revenue - insurance trust revenue (1)" if cat=="insurance trust revenue"
replace cat="total revenue - insurance trust revenue (1)" if cat=="insurance trust revenue (1)"
replace cat="total revenue - liquor stores revenue" if cat=="liquor store revenue"
replace cat="total revenue - utility revenue" if cat=="utility revenue"
replace cat="total expenditure - exhibit: salaries and wages" if cat=="exhibit: salaries and wages"
replace cat="total revenue - general revenue - total taxes - corporation net income taxes" if cat=="corporation net income"
replace cat="total revenue - general revenue - total taxes - individual income taxes" if cat=="individual income tax"
replace cat="total revenue - liquor stores revenue" if cat=="liquor stores revenue"
replace cat="general expenditure, by function: - governmental administration" if cat=="governmental administration"
*Assess changes made above
tab cat
tab cat if fiscalyear<2012
gen constant=1
egen constantsum=sum(constant), by(stateno fiscalyear cat)
tab constantsum
tab cat if constantsum==2
tab fiscalyear if constantsum==2
sort fiscalyear stateno count
gen count2=_n
egen temp=min(count2), by(stateno fiscalyear cat)
gen temp2=count2-temp
drop count2
gen dum=1
replace dum=2 if temp2!=0
drop temp temp2
egen constantsum2=sum(constant), by(stateno fiscalyear cat dum)
tab constantsum2
*All 1,  good.
drop constantsum2
*THE FOLLOWING DEALS WITH NON-OBVIOUS CATS FOR STANDARDIZATION
*STANDARDIZING DIRECT EXPENDITURES CATS
egen temp=min(amount), by(stateno fiscalyear cat)
gen temp2=amount-temp
gen smallest=0
replace smallest=1 if temp2==0
drop temp temp2
tab dum smallest if constantsum==2&cat=="direct expenditure"
*dum=2 is the smallest 990 times out of 1016 times.  The other 26 times are the missing values in 1999, so the theory (that the second listing is direct general expenditure, a subset of total direct expenditure) is supported.
*This implies the following change.  
replace cat="total expenditure - direct expenditure" if cat=="direct expenditure"&dum==1
replace cat="total expenditure - general expenditure - direct general expenditure" if cat=="direct expenditure"&dum==2
*STANDARDIZING TOTAL EXPENDITURES CATS
tab dum smallest if constantsum==2&cat=="total expenditure"
*Except for 26 cases (*2), which are the missing cases for this variable,
*dum=1 and dum=2 are always the smallest.  This implies that the values for amount are identical for this variable 
*for dum=1 and dum=2 in the same stateno and fiscalyear.  
egen minamount=min(amount), by(stateno fiscalyear cat)
egen maxamount=max(amount), by(stateno fiscalyear cat)
gen difamount=minamount-maxamount
tab difamount if constantsum==2&cat=="total expenditure"
*Yes, they are always identical.  1980 cases.  
tab dum if cat=="total expenditure"
*dum=1 is 1118 cases, dum=2 is 1016 cases.  You can get rid of dum=2.
drop if cat=="total expenditure"&dum==2
*STANDARDIZING INTERGOV EXPENDITURES CATS
tab dum smallest if constantsum==2&cat=="intergovernmental expenditure"
*These appear to always be the same as well.  
tab difamount if constantsum==2&cat=="intergovernmental expenditure"
*Yes, same for 1980 cases.  52 cases aren't the same, but those are system missing.  
*What about the difference between 
*cat=total expenditure - general expenditure - intergovernmental general expenditure
*and
*cat=total expenditure - intergovernmental expenditure
*Within one state and year?
*These values only exist in 2012 and 2013.  
gen temp=0
replace temp=1 if cat=="total expenditure - general expenditure - intergovernmental general expenditure"
replace temp=1 if cat=="total expenditure - intergovernmental expenditure"
egen minamount2=min(amount), by(stateno fiscalyear temp)
egen maxamount2=max(amount), by(stateno fiscalyear temp)
gen difamount2=minamount2-maxamount2
tab difamount2 if temp==1
*They are never different.
*To make the relationships between the categories as clear as possible, I'm going to leave the pairs of identical cases under the 
*two different category headings in, and make them consistent over time.  
replace cat="total expenditure - intergovernmental expenditure" if cat=="intergovernmental expenditure"&dum==1
replace cat="total expenditure - general expenditure - intergovernmental general expenditure" if cat=="intergovernmental expenditure"&dum==2
tab cat
*Every cat from the 1992-2011 period has a 1118 total for 1992-2013.  That's good.  
*There are some cats from 2012 to 2013 that were only observed in those years.  
drop  constantsum dum smallest minamount maxamount difamount temp minamount2 maxamount2 difamount2 constant
rename v1 catoriginal
sort fiscalyear stateno cat
*From the documentation: "Numbers like '-11111', '-11333', and '-55555' are flags meaning that the data item was not published in year cited (see User Guide for details)."
tab fiscalyear if amount==-11111
tab fiscalyear if amount==-11333
tab fiscalyear if amount==-55555
*none of those values existed, good.
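*The three tabulations above can also be condensed into a single count (a sketch; inlist() is a built-in Stata function).
*A result of 0 means none of the flag values are present.
count if inlist(amount, -11111, -11333, -55555)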
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\074StateGovFinances1992to2013.dta", replace

*The following creates state finance variables comparable to those already in the base file, at UOA=stateno-fiscalyear
*and merges them into the base dataset.  
gen rev2=0
gen genrev2=0
gen intergovrev2=0
gen intergovrevlocal2=0
gen intergovrevfed2=0
gen tax2=0
gen salestax2=0
gen salesselect2=0
gen inctax2=0
gen corptax2=0
gen exp2=0
gen genexp2=0
gen debt2=0
replace rev2=amount if cat=="total revenue"
replace genrev2=amount if cat=="total revenue - general revenue"
replace intergovrev2=amount if cat=="total revenue - general revenue - intergovernmental revenue"
replace intergovrevlocal2=amount if cat=="total revenue - general revenue - intergovernmental revenue - from local"
replace intergovrevfed2=amount if cat=="total revenue - general revenue - intergovernmental revenue - from federal"
replace tax2=amount if cat=="total revenue - general revenue - total taxes"
replace salestax2=amount if cat=="total revenue - general revenue - total taxes - general sales and gross receipts taxes"
replace salesselect2=amount if cat=="total revenue - general revenue - total taxes - selective sales and gross receipts taxes"
replace inctax2=amount if cat=="total revenue - general revenue - total taxes - individual income taxes"
replace corptax2=amount if cat=="total revenue - general revenue - total taxes - corporation net income taxes"
replace exp2=amount if cat=="total expenditure"
replace genexp2=amount if cat=="total expenditure - general expenditure"
replace debt2=amount if cat=="debt at end of fiscal year"
collapse (sum) rev2 genrev2 intergovrev2 intergovrevlocal2 intergovrevfed2 tax2 salestax2 salesselect2 inctax2 corptax2 exp2 genexp2 debt2, by(stateno fiscalyear)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\037BaseDataset.dta", clear
merge m:1 stateno fiscalyear using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
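*Before _merge is dropped, it can help to see how the merge went (a sketch added for illustration).
*_merge==1 marks base rows with no finance data; _merge==2 marks finance rows with no matching base row.
tab _merge
count if _merge==2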
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\075BaseDataset.dta", replace

*THE FOLLOWING GETS ALL THE ANNUAL STATE GOV FINANCE TEXT FILES CONVERTED TO VARIABLES REPRESENTING A 
*SPECIFIC STATE GOV FINANCE CATEGORY AND MERGES THEM WITH THE MAIN FILE.
clear
gen v1=""
gen fiscalyear=.
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta", replace
forvalues iteration=48(1)65 {
clear
insheet using C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\0`iteration'StateGovFinances.txt
gen fiscalyear=1948+`iteration'
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta"
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta", replace
}
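*A quick way to confirm that the loop appended all 18 files (fiscal years 1996 through 2013).
*This is only a sketch and assumes each file contributed at least one row.
tab fiscalyear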
gen statecode=substr(v1,1,2)
destring statecode, replace
gen stateno=.
replace stateno=8.5 if statecode==9
replace stateno=statecode if statecode<9
replace stateno=statecode-1 if statecode>9
drop statecode
gen temp=substr(v1,-18,12)
destring temp, gen(amount) force
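*How the two commands above behave, illustrated with a made-up string:
*substr() with a negative start counts from the end of the string, and destring with
*the force option turns anything nonnumeric into missing rather than stopping with an error.
display substr("abcdefghij",-3,3)
*The line above displays hij.
display substr("abcdefghij",-5,2)
*The line above displays fg.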
gen temp2=0
replace temp2=1 if amount==.
tab fiscalyear temp2
*All of the rows that failed to convert are in fiscalyear=1999.
*That file has a totally different format.
gen temp3=substr(v1,-10,10)
destring temp3, gen(temp4) force
replace amount=temp4 if temp2==1
drop temp temp3 temp4
gen code=substr(v1,15,3)
gen temp=substr(v1,-13,3)
replace code=temp if temp2==1
drop temp2 temp
order  fiscalyear stateno code amount
rename v1 original
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta", replace
gen constant=1
egen constantsum=sum(constant), by(fiscalyear stateno code)
tab constantsum
*all 1, good.
clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta", clear
tab fiscalyear if amount==-11111
tab fiscalyear if amount==-11333
tab fiscalyear if amount==-55555
*none of those values existed, good.
clear

*What the following three character codes stand for is in the Census documentation.  
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\076StateGovFinanceItemCodes1996to2013.dta", clear
gen proptax3=0
replace proptax3=1 if code=="T01"
gen legop3=0
replace legop3=1 if code=="E26"
gen legconstr3=0
replace legconstr3=1 if code=="F26"
gen legcapexp3=0
replace legcapexp3=1 if code=="G26"
gen legequip3=0
replace legequip3=1 if code=="K26"
replace proptax3=amount if proptax3==1
replace legop3=amount if legop3==1
replace legconstr3=amount if legconstr3==1
replace legcapexp3=amount if legcapexp3==1
replace legequip3=amount if legequip3==1
*Now collapse & merge with main file.  
collapse (sum)  proptax3 legop3 legconstr3 legcapexp3 legequip3, by( stateno fiscalyear)
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
*Now merge them into the base file.  
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\075BaseDataset.dta", clear
merge m:1 stateno fiscalyear using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", replace

*See the file 078RevisionDatesOfCensusFiles.xlsx for a justification of why I used either the state government finance variables with a "1" suffix or a "2" (and sometimes "3") suffix.  I used 2 (or 3 when appropriate) for fy2002 and on, and 1 for fy2001 and before.  

*COMPARE THE LEGISLATIVE SPENDING COMPONENTS TO FIGURE OUT HOW TO MAKE TOTAL SPENDING ON THE STATE LEGISLATURE.  
clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
gen legopdif=(legop1-legop3)/legop1
recode legopdif (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legopdif
*identical in all years.
gen legcapdif=(legcap1-(legconstr3+legcapexp3))/legcap1
recode legcapdif (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legcapdif
*identical in all years.  
gen legtot1dif1=(legtot1-(legop1+legcap1))/legtot1
recode legtot1dif1 (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legtot1dif1
*identical in all years.  
*A total isn't reported for the "3" series.  
*But from the above, it appears that the total is found by adding the following three components: 
*legop3, legconstr3 and legcapexp3.  legequip3 is already included in some of those.  
gen legtotdif1=(legop3+legconstr3+legcapexp3+legequip3-legtot1)/legtot1
recode legtotdif1 (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legtotdif1
*The four components do not equal legtot1, implying that the above is correct.  
gen legtotdif2=(legop3+legconstr3+legcapexp3-legtot1)/legtot1
recode legtotdif2 (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legtotdif2
*The three components always equal legtot1, implying the above is correct.  
gen legtotdif3=((legop3-legequip3)+legconstr3+legcapexp3-legtot1)/legtot1
recode legtotdif3 (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
tab fiscalyear legtotdif3
*The above didn't result in all 0s, which is consistent with what I concluded above.  
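*The tables above can be backed up with counts (a sketch added for illustration).
*After the recode, a code of 1, 2, or 3 in absolute value marks a row where the sum of components differs from legtot1.
count if abs(legtotdif2)>=1 & legtotdif2<.
count if abs(legtotdif1)>=1 & legtotdif1<.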
clear

clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
gen rev=.
gen revfed=.
gen revnofed=.
gen genrev=.
gen tax=.
gen proptax=.
gen salestax=.
gen salesselect=.
gen inctax=.
gen corptax=.
gen exp=.
gen genexp=.
gen debt=.
gen legtot=.
gen legop=.
replace rev=rev1 if fiscalyear<2002
replace revfed=intergovrevfed1 if fiscalyear<2002
replace revnofed=rev1-intergovrevfed1 if fiscalyear<2002
replace genrev=genrev1 if fiscalyear<2002
replace tax=tax1 if fiscalyear<2002
replace proptax=proptax1 if fiscalyear<2002
replace salestax=salestax1 if fiscalyear<2002
replace salesselect=salesselect1 if fiscalyear<2002
replace inctax=inctax1 if fiscalyear<2002
replace corptax=corptax1 if fiscalyear<2002
replace exp=exp1 if fiscalyear<2002
replace genexp=genexp1 if fiscalyear<2002
replace debt=debt1 if fiscalyear<2002
replace legtot=legtot1 if fiscalyear<2002
replace legop=legop1 if fiscalyear<2002
replace rev=rev2 if fiscalyear>2001
replace revfed=intergovrevfed2 if fiscalyear>2001
replace revnofed=rev2-intergovrevfed2 if fiscalyear>2001
replace genrev=genrev2 if fiscalyear>2001
replace tax=tax2 if fiscalyear>2001
replace proptax=proptax3 if fiscalyear>2001
replace salestax=salestax2 if fiscalyear>2001
replace salesselect=salesselect2 if fiscalyear>2001
replace inctax=inctax2 if fiscalyear>2001
replace corptax=corptax2 if fiscalyear>2001
replace exp=exp2 if fiscalyear>2001
replace genexp=genexp2 if fiscalyear>2001
replace debt=debt2 if fiscalyear>2001
replace legtot=legop3+legconstr3+legcapexp3 if fiscalyear>2001
replace legop=legop3 if fiscalyear>2001
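*A check that every later fiscal year picked up an operating-expenditure value (a sketch).
*It counts rows after fy2001 where legop is still missing even though legop3 is available; 0 is the desired result.
count if fiscalyear>2001 & fiscalyear<. & legop==. & legop3<.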
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", replace

*WHAT % OF SPENDING ON THE STATE LEGISLATURE IS CURRENT OPERATING EXPENDITURES?
clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
gen legopper=(legop/legtot)*100
sum legopper
sort stateno
by stateno: sum legopper 
clear
*ave = 96.9
*states range from 93 to 99%.
*This distinction can probably be ignored.  There are probably some years when spending on the state legislature spikes because of construction, so that might be problematic.  

clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
drop rev1 revinternal1 intergovrevfed1 genrev1 genrevinternal1 tax1 proptax1 salestax1 salesselect1 inctax1 corptax1 exp1 genexp1 debt1 legtot1 legop1 legcap1 ///
rev2 genrev2 intergovrev2 intergovrevlocal2 intergovrevfed2 tax2 salestax2 salesselect2 inctax2 corptax2 exp2 genexp2 debt2 ///
proptax3 legop3 legconstr3 legcapexp3 legequip3
gen surplus=rev-exp
gen gensurplus=genrev-genexp

sort stateno year quar
order ///
statename ///
stateabrev ///
statefips ///
stateno ///
stateicpsr ///
region ///
nationdum ///
isastate ///
year ///
quar ///
yearquar ///
isaquar ///
fyendmonth ///
fyendday ///
fyendquar ///
fiscalyear ///
fyirregular ///
fyshort ///
oddevenyear ///
electyearbien ///
pop ///
pop_a ///
cpinat ///
cpinat_a ///
cpireg ///
cpireg_a ///
house ///
house_a ///
inc ///
inc_a ///
disinc ///
disinc_a ///
gsp ///
gsp_a ///
unemploy_unadj ///
unemploy_adj ///
unemploy_a ///
rev ///
revfed ///
revnofed ///
genrev ///
tax ///
proptax ///
salestax ///
salesselect ///
inctax ///
corptax ///
exp ///
genexp ///
surplus ///
gensurplus ///
debt ///
legtot
drop legop
rename legtot leg

*THE FOLLOWING CREATES VARIABLES FROM OTHER VARIABLES
gen real1incpc=(inc/cpinat)/pop
gen real1gsppc=(gsp/cpinat)/pop
gen real1revpc=(rev/cpinat)/pop
gen real1revnofedpc=(revnofed/cpinat)/pop
gen real1genrevpc=(genrev/cpinat)/pop
gen real1taxpc=(tax/cpinat)/pop
gen real1proptaxpc=(proptax/cpinat)/pop
gen real1salestaxpc=(salestax/cpinat)/pop
gen real1salesselectpc=(salesselect/cpinat)/pop
gen real1inctaxpc=(inctax/cpinat)/pop
gen real1corptaxpc=(corptax/cpinat)/pop
gen real1exppc=(exp/cpinat)/pop
gen real1genexppc=(genexp/cpinat)/pop
gen real1surpluspc=(surplus/cpinat)/pop
gen real1gensurpluspc=(gensurplus/cpinat)/pop
gen real1debtpc=(debt/cpinat)/pop
gen real1legpc=(leg/cpinat)/pop
gen real2incpc=(inc/cpireg)/pop
gen real2gsppc=(gsp/cpireg)/pop
gen real2revpc=(rev/cpireg)/pop
gen real2revnofedpc=(revnofed/cpireg)/pop
gen real2genrevpc=(genrev/cpireg)/pop
gen real2taxpc=(tax/cpireg)/pop
gen real2proptaxpc=(proptax/cpireg)/pop
gen real2salestaxpc=(salestax/cpireg)/pop
gen real2salesselectpc=(salesselect/cpireg)/pop
gen real2inctaxpc=(inctax/cpireg)/pop
gen real2corptaxpc=(corptax/cpireg)/pop
gen real2exppc=(exp/cpireg)/pop
gen real2genexppc=(genexp/cpireg)/pop
gen real2surpluspc=(surplus/cpireg)/pop
gen real2gensurpluspc=(gensurplus/cpireg)/pop
gen real2debtpc=(debt/cpireg)/pop
gen real2legpc=(leg/cpireg)/pop
gen revpinc=(rev/inc)*100
gen revnofedpinc=(revnofed/inc)*100
gen genrevpinc=(genrev/inc)*100
gen taxpinc=(tax/inc)*100
gen proptaxpinc=(proptax/inc)*100
gen salestaxpinc=(salestax/inc)*100
gen salesselectpinc=(salesselect/inc)*100
gen inctaxpinc=(inctax/inc)*100
gen corptaxpinc=(corptax/inc)*100
gen exppinc=(exp/inc)*100
gen genexppinc=(genexp/inc)*100
gen surpluspinc=(surplus/inc)*100
gen gensurpluspinc=(gensurplus/inc)*100
gen debtpinc=(debt/inc)*100
gen legpinc=(leg/inc)*100
gen revpgsp=(rev/gsp)*100
gen revnofedpgsp=(revnofed/gsp)*100
gen genrevpgsp=(genrev/gsp)*100
gen taxpgsp=(tax/gsp)*100
gen proptaxpgsp=(proptax/gsp)*100
gen salestaxpgsp=(salestax/gsp)*100
gen salesselectpgsp=(salesselect/gsp)*100
gen inctaxpgsp=(inctax/gsp)*100
gen corptaxpgsp=(corptax/gsp)*100
gen exppgsp=(exp/gsp)*100
gen genexppgsp=(genexp/gsp)*100
gen surpluspgsp=(surplus/gsp)*100
gen gensurpluspgsp=(gensurplus/gsp)*100
gen debtpgsp=(debt/gsp)*100
gen legpgsp=(leg/gsp)*100
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", replace
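*The per capita variables above all follow one formula, so a loop can spot-check them.
*This is only a sketch: it recomputes three of them in double precision and counts rows that
*disagree with the stored values by more than a rounding-sized amount.  The counts should be 0.
foreach v in rev exp debt {
tempvar chk
gen double `chk'=(`v'/cpinat)/pop
count if abs(`chk'-real1`v'pc)>abs(`chk')*0.000001 & `chk'<.
drop `chk'
}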

*VARIABLES COMPUTED AS CHANGES
*LAGGING VARIABLES
keep if quar==2.5
tsset stateno year
by stateno: gen real1incpcL=real1incpc[_n-1]
by stateno: gen real1gsppcL=real1gsppc[_n-1]
by stateno: gen real1revpcL=real1revpc[_n-1]
by stateno: gen real1revnofedpcL=real1revnofedpc[_n-1]
by stateno: gen real1genrevpcL=real1genrevpc[_n-1]
by stateno: gen real1taxpcL=real1taxpc[_n-1]
by stateno: gen real1proptaxpcL=real1proptaxpc[_n-1]
by stateno: gen real1salestaxpcL=real1salestaxpc[_n-1]
by stateno: gen real1salesselectpcL=real1salesselectpc[_n-1]
by stateno: gen real1inctaxpcL=real1inctaxpc[_n-1]
by stateno: gen real1corptaxpcL=real1corptaxpc[_n-1]
by stateno: gen real1exppcL=real1exppc[_n-1]
by stateno: gen real1genexppcL=real1genexppc[_n-1]
by stateno: gen real1surpluspcL=real1surpluspc[_n-1]
by stateno: gen real1gensurpluspcL=real1gensurpluspc[_n-1]
by stateno: gen real1debtpcL=real1debtpc[_n-1]
by stateno: gen real1legpcL=real1legpc[_n-1]
by stateno: gen real2incpcL=real2incpc[_n-1]
by stateno: gen real2gsppcL=real2gsppc[_n-1]
by stateno: gen real2revpcL=real2revpc[_n-1]
by stateno: gen real2revnofedpcL=real2revnofedpc[_n-1]
by stateno: gen real2genrevpcL=real2genrevpc[_n-1]
by stateno: gen real2taxpcL=real2taxpc[_n-1]
by stateno: gen real2proptaxpcL=real2proptaxpc[_n-1]
by stateno: gen real2salestaxpcL=real2salestaxpc[_n-1]
by stateno: gen real2salesselectpcL=real2salesselectpc[_n-1]
by stateno: gen real2inctaxpcL=real2inctaxpc[_n-1]
by stateno: gen real2corptaxpcL=real2corptaxpc[_n-1]
by stateno: gen real2exppcL=real2exppc[_n-1]
by stateno: gen real2genexppcL=real2genexppc[_n-1]
by stateno: gen real2surpluspcL=real2surpluspc[_n-1]
by stateno: gen real2gensurpluspcL=real2gensurpluspc[_n-1]
by stateno: gen real2debtpcL=real2debtpc[_n-1]
by stateno: gen real2legpcL=real2legpc[_n-1]
by stateno: gen revpincL=revpinc[_n-1]
by stateno: gen revnofedpincL=revnofedpinc[_n-1]
by stateno: gen genrevpincL=genrevpinc[_n-1]
by stateno: gen taxpincL=taxpinc[_n-1]
by stateno: gen proptaxpincL=proptaxpinc[_n-1]
by stateno: gen salestaxpincL=salestaxpinc[_n-1]
by stateno: gen salesselectpincL=salesselectpinc[_n-1]
by stateno: gen inctaxpincL=inctaxpinc[_n-1]
by stateno: gen corptaxpincL=corptaxpinc[_n-1]
by stateno: gen exppincL=exppinc[_n-1]
by stateno: gen genexppincL=genexppinc[_n-1]
by stateno: gen surpluspincL=surpluspinc[_n-1]
by stateno: gen gensurpluspincL=gensurpluspinc[_n-1]
by stateno: gen debtpincL=debtpinc[_n-1]
by stateno: gen legpincL=legpinc[_n-1]
by stateno: gen revpgspL=revpgsp[_n-1]
by stateno: gen revnofedpgspL=revnofedpgsp[_n-1]
by stateno: gen genrevpgspL=genrevpgsp[_n-1]
by stateno: gen taxpgspL=taxpgsp[_n-1]
by stateno: gen proptaxpgspL=proptaxpgsp[_n-1]
by stateno: gen salestaxpgspL=salestaxpgsp[_n-1]
by stateno: gen salesselectpgspL=salesselectpgsp[_n-1]
by stateno: gen inctaxpgspL=inctaxpgsp[_n-1]
by stateno: gen corptaxpgspL=corptaxpgsp[_n-1]
by stateno: gen exppgspL=exppgsp[_n-1]
by stateno: gen genexppgspL=genexppgsp[_n-1]
by stateno: gen surpluspgspL=surpluspgsp[_n-1]
by stateno: gen gensurpluspgspL=gensurpluspgsp[_n-1]
by stateno: gen debtpgspL=debtpgsp[_n-1]
by stateno: gen legpgspL=legpgsp[_n-1]
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
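*The lags above use [_n-1], which is the previous row within the state, not necessarily the previous year.
*A sketch of a check against Stata's L. time-series operator, which respects the tsset time variable:
*the count is the number of rows where the two definitions of the lag disagree, for example because of a gap in years.  
count if real1revpcL!=L.real1revpc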

clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
drop if quar==2.5
tsset stateno yearquar
by stateno: gen real1incpcL=real1incpc[_n-1]
by stateno: gen real1gsppcL=real1gsppc[_n-1]
by stateno: gen real1revpcL=real1revpc[_n-1]
by stateno: gen real1revnofedpcL=real1revnofedpc[_n-1]
by stateno: gen real1genrevpcL=real1genrevpc[_n-1]
by stateno: gen real1taxpcL=real1taxpc[_n-1]
by stateno: gen real1proptaxpcL=real1proptaxpc[_n-1]
by stateno: gen real1salestaxpcL=real1salestaxpc[_n-1]
by stateno: gen real1salesselectpcL=real1salesselectpc[_n-1]
by stateno: gen real1inctaxpcL=real1inctaxpc[_n-1]
by stateno: gen real1corptaxpcL=real1corptaxpc[_n-1]
by stateno: gen real1exppcL=real1exppc[_n-1]
by stateno: gen real1genexppcL=real1genexppc[_n-1]
by stateno: gen real1surpluspcL=real1surpluspc[_n-1]
by stateno: gen real1gensurpluspcL=real1gensurpluspc[_n-1]
by stateno: gen real1debtpcL=real1debtpc[_n-1]
by stateno: gen real1legpcL=real1legpc[_n-1]
by stateno: gen real2incpcL=real2incpc[_n-1]
by stateno: gen real2gsppcL=real2gsppc[_n-1]
by stateno: gen real2revpcL=real2revpc[_n-1]
by stateno: gen real2revnofedpcL=real2revnofedpc[_n-1]
by stateno: gen real2genrevpcL=real2genrevpc[_n-1]
by stateno: gen real2taxpcL=real2taxpc[_n-1]
by stateno: gen real2proptaxpcL=real2proptaxpc[_n-1]
by stateno: gen real2salestaxpcL=real2salestaxpc[_n-1]
by stateno: gen real2salesselectpcL=real2salesselectpc[_n-1]
by stateno: gen real2inctaxpcL=real2inctaxpc[_n-1]
by stateno: gen real2corptaxpcL=real2corptaxpc[_n-1]
by stateno: gen real2exppcL=real2exppc[_n-1]
by stateno: gen real2genexppcL=real2genexppc[_n-1]
by stateno: gen real2surpluspcL=real2surpluspc[_n-1]
by stateno: gen real2gensurpluspcL=real2gensurpluspc[_n-1]
by stateno: gen real2debtpcL=real2debtpc[_n-1]
by stateno: gen real2legpcL=real2legpc[_n-1]
by stateno: gen revpincL=revpinc[_n-1]
by stateno: gen revnofedpincL=revnofedpinc[_n-1]
by stateno: gen genrevpincL=genrevpinc[_n-1]
by stateno: gen taxpincL=taxpinc[_n-1]
by stateno: gen proptaxpincL=proptaxpinc[_n-1]
by stateno: gen salestaxpincL=salestaxpinc[_n-1]
by stateno: gen salesselectpincL=salesselectpinc[_n-1]
by stateno: gen inctaxpincL=inctaxpinc[_n-1]
by stateno: gen corptaxpincL=corptaxpinc[_n-1]
by stateno: gen exppincL=exppinc[_n-1]
by stateno: gen genexppincL=genexppinc[_n-1]
by stateno: gen surpluspincL=surpluspinc[_n-1]
by stateno: gen gensurpluspincL=gensurpluspinc[_n-1]
by stateno: gen debtpincL=debtpinc[_n-1]
by stateno: gen legpincL=legpinc[_n-1]
by stateno: gen revpgspL=revpgsp[_n-1]
by stateno: gen revnofedpgspL=revnofedpgsp[_n-1]
by stateno: gen genrevpgspL=genrevpgsp[_n-1]
by stateno: gen taxpgspL=taxpgsp[_n-1]
by stateno: gen proptaxpgspL=proptaxpgsp[_n-1]
by stateno: gen salestaxpgspL=salestaxpgsp[_n-1]
by stateno: gen salesselectpgspL=salesselectpgsp[_n-1]
by stateno: gen inctaxpgspL=inctaxpgsp[_n-1]
by stateno: gen corptaxpgspL=corptaxpgsp[_n-1]
by stateno: gen exppgspL=exppgsp[_n-1]
by stateno: gen genexppgspL=genexppgsp[_n-1]
by stateno: gen surpluspgspL=surpluspgsp[_n-1]
by stateno: gen gensurpluspgspL=gensurpluspgsp[_n-1]
by stateno: gen debtpgspL=debtpgsp[_n-1]
by stateno: gen legpgspL=legpgsp[_n-1]
append using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"

*The following creates difference variables.
gen difreal1incpc=real1incpc-real1incpcL
gen difreal1gsppc=real1gsppc-real1gsppcL
gen difreal1revpc=real1revpc-real1revpcL
gen difreal1revnofedpc=real1revnofedpc-real1revnofedpcL
gen difreal1genrevpc=real1genrevpc-real1genrevpcL
gen difreal1taxpc=real1taxpc-real1taxpcL
gen difreal1proptaxpc=real1proptaxpc-real1proptaxpcL
gen difreal1salestaxpc=real1salestaxpc-real1salestaxpcL
gen difreal1salesselectpc=real1salesselectpc-real1salesselectpcL
gen difreal1inctaxpc=real1inctaxpc-real1inctaxpcL
gen difreal1corptaxpc=real1corptaxpc-real1corptaxpcL
gen difreal1exppc=real1exppc-real1exppcL
gen difreal1genexppc=real1genexppc-real1genexppcL
gen difreal1surpluspc=real1surpluspc-real1surpluspcL
gen difreal1gensurpluspc=real1gensurpluspc-real1gensurpluspcL
gen difreal1debtpc=real1debtpc-real1debtpcL
gen difreal1legpc=real1legpc-real1legpcL
gen difreal2incpc=real2incpc-real2incpcL
gen difreal2gsppc=real2gsppc-real2gsppcL
gen difreal2revpc=real2revpc-real2revpcL
gen difreal2revnofedpc=real2revnofedpc-real2revnofedpcL
gen difreal2genrevpc=real2genrevpc-real2genrevpcL
gen difreal2taxpc=real2taxpc-real2taxpcL
gen difreal2proptaxpc=real2proptaxpc-real2proptaxpcL
gen difreal2salestaxpc=real2salestaxpc-real2salestaxpcL
gen difreal2salesselectpc=real2salesselectpc-real2salesselectpcL
gen difreal2inctaxpc=real2inctaxpc-real2inctaxpcL
gen difreal2corptaxpc=real2corptaxpc-real2corptaxpcL
gen difreal2exppc=real2exppc-real2exppcL
gen difreal2genexppc=real2genexppc-real2genexppcL
gen difreal2surpluspc=real2surpluspc-real2surpluspcL
gen difreal2gensurpluspc=real2gensurpluspc-real2gensurpluspcL
gen difreal2debtpc=real2debtpc-real2debtpcL
gen difreal2legpc=real2legpc-real2legpcL
gen difrevpinc=revpinc-revpincL
gen difrevnofedpinc=revnofedpinc-revnofedpincL
gen difgenrevpinc=genrevpinc-genrevpincL
gen diftaxpinc=taxpinc-taxpincL
gen difproptaxpinc=proptaxpinc-proptaxpincL
gen difsalestaxpinc=salestaxpinc-salestaxpincL
gen difsalesselectpinc=salesselectpinc-salesselectpincL
gen difinctaxpinc=inctaxpinc-inctaxpincL
gen difcorptaxpinc=corptaxpinc-corptaxpincL
gen difexppinc=exppinc-exppincL
gen difgenexppinc=genexppinc-genexppincL
gen difsurpluspinc=surpluspinc-surpluspincL
gen difgensurpluspinc=gensurpluspinc-gensurpluspincL
gen difdebtpinc=debtpinc-debtpincL
gen diflegpinc=legpinc-legpincL
gen difrevpgsp=revpgsp-revpgspL
gen difrevnofedpgsp=revnofedpgsp-revnofedpgspL
gen difgenrevpgsp=genrevpgsp-genrevpgspL
gen diftaxpgsp=taxpgsp-taxpgspL
gen difproptaxpgsp=proptaxpgsp-proptaxpgspL
gen difsalestaxpgsp=salestaxpgsp-salestaxpgspL
gen difsalesselectpgsp=salesselectpgsp-salesselectpgspL
gen difinctaxpgsp=inctaxpgsp-inctaxpgspL
gen difcorptaxpgsp=corptaxpgsp-corptaxpgspL
gen difexppgsp=exppgsp-exppgspL
gen difgenexppgsp=genexppgsp-genexppgspL
gen difsurpluspgsp=surpluspgsp-surpluspgspL
gen difgensurpluspgsp=gensurpluspgsp-gensurpluspgspL
gen difdebtpgsp=debtpgsp-debtpgspL
gen diflegpgsp=legpgsp-legpgspL

*The following computes percent change variables.  
gen chgreal1incpc=((real1incpc-real1incpcL)/real1incpcL)*100
gen chgreal1gsppc=((real1gsppc-real1gsppcL)/real1gsppcL)*100
gen chgreal1revpc=((real1revpc-real1revpcL)/real1revpcL)*100
gen chgreal1revnofedpc=((real1revnofedpc-real1revnofedpcL)/real1revnofedpcL)*100
gen chgreal1genrevpc=((real1genrevpc-real1genrevpcL)/real1genrevpcL)*100
gen chgreal1taxpc=((real1taxpc-real1taxpcL)/real1taxpcL)*100
gen chgreal1proptaxpc=((real1proptaxpc-real1proptaxpcL)/real1proptaxpcL)*100
gen chgreal1salestaxpc=((real1salestaxpc-real1salestaxpcL)/real1salestaxpcL)*100
gen chgreal1salesselectpc=((real1salesselectpc-real1salesselectpcL)/real1salesselectpcL)*100
gen chgreal1inctaxpc=((real1inctaxpc-real1inctaxpcL)/real1inctaxpcL)*100
gen chgreal1corptaxpc=((real1corptaxpc-real1corptaxpcL)/real1corptaxpcL)*100
gen chgreal1exppc=((real1exppc-real1exppcL)/real1exppcL)*100
gen chgreal1genexppc=((real1genexppc-real1genexppcL)/real1genexppcL)*100
gen chgreal1surpluspc=((real1surpluspc-real1surpluspcL)/real1surpluspcL)*100
gen chgreal1gensurpluspc=((real1gensurpluspc-real1gensurpluspcL)/real1gensurpluspcL)*100
gen chgreal1debtpc=((real1debtpc-real1debtpcL)/real1debtpcL)*100
gen chgreal1legpc=((real1legpc-real1legpcL)/real1legpcL)*100
gen chgreal2incpc=((real2incpc-real2incpcL)/real2incpcL)*100
gen chgreal2gsppc=((real2gsppc-real2gsppcL)/real2gsppcL)*100
gen chgreal2revpc=((real2revpc-real2revpcL)/real2revpcL)*100
gen chgreal2revnofedpc=((real2revnofedpc-real2revnofedpcL)/real2revnofedpcL)*100
gen chgreal2genrevpc=((real2genrevpc-real2genrevpcL)/real2genrevpcL)*100
gen chgreal2taxpc=((real2taxpc-real2taxpcL)/real2taxpcL)*100
gen chgreal2proptaxpc=((real2proptaxpc-real2proptaxpcL)/real2proptaxpcL)*100
gen chgreal2salestaxpc=((real2salestaxpc-real2salestaxpcL)/real2salestaxpcL)*100
gen chgreal2salesselectpc=((real2salesselectpc-real2salesselectpcL)/real2salesselectpcL)*100
gen chgreal2inctaxpc=((real2inctaxpc-real2inctaxpcL)/real2inctaxpcL)*100
gen chgreal2corptaxpc=((real2corptaxpc-real2corptaxpcL)/real2corptaxpcL)*100
gen chgreal2exppc=((real2exppc-real2exppcL)/real2exppcL)*100
gen chgreal2genexppc=((real2genexppc-real2genexppcL)/real2genexppcL)*100
gen chgreal2surpluspc=((real2surpluspc-real2surpluspcL)/real2surpluspcL)*100
gen chgreal2gensurpluspc=((real2gensurpluspc-real2gensurpluspcL)/real2gensurpluspcL)*100
gen chgreal2debtpc=((real2debtpc-real2debtpcL)/real2debtpcL)*100
gen chgreal2legpc=((real2legpc-real2legpcL)/real2legpcL)*100
gen chgrevpinc=((revpinc-revpincL)/revpincL)*100
gen chgrevnofedpinc=((revnofedpinc-revnofedpincL)/revnofedpincL)*100
gen chggenrevpinc=((genrevpinc-genrevpincL)/genrevpincL)*100
gen chgtaxpinc=((taxpinc-taxpincL)/taxpincL)*100
gen chgproptaxpinc=((proptaxpinc-proptaxpincL)/proptaxpincL)*100
gen chgsalestaxpinc=((salestaxpinc-salestaxpincL)/salestaxpincL)*100
gen chgsalesselectpinc=((salesselectpinc-salesselectpincL)/salesselectpincL)*100
gen chginctaxpinc=((inctaxpinc-inctaxpincL)/inctaxpincL)*100
gen chgcorptaxpinc=((corptaxpinc-corptaxpincL)/corptaxpincL)*100
gen chgexppinc=((exppinc-exppincL)/exppincL)*100
gen chggenexppinc=((genexppinc-genexppincL)/genexppincL)*100
gen chgsurpluspinc=((surpluspinc-surpluspincL)/surpluspincL)*100
gen chggensurpluspinc=((gensurpluspinc-gensurpluspincL)/gensurpluspincL)*100
gen chgdebtpinc=((debtpinc-debtpincL)/debtpincL)*100
gen chglegpinc=((legpinc-legpincL)/legpincL)*100
gen chgrevpgsp=((revpgsp-revpgspL)/revpgspL)*100
gen chgrevnofedpgsp=((revnofedpgsp-revnofedpgspL)/revnofedpgspL)*100
gen chggenrevpgsp=((genrevpgsp-genrevpgspL)/genrevpgspL)*100
gen chgtaxpgsp=((taxpgsp-taxpgspL)/taxpgspL)*100
gen chgproptaxpgsp=((proptaxpgsp-proptaxpgspL)/proptaxpgspL)*100
gen chgsalestaxpgsp=((salestaxpgsp-salestaxpgspL)/salestaxpgspL)*100
gen chgsalesselectpgsp=((salesselectpgsp-salesselectpgspL)/salesselectpgspL)*100
gen chginctaxpgsp=((inctaxpgsp-inctaxpgspL)/inctaxpgspL)*100
gen chgcorptaxpgsp=((corptaxpgsp-corptaxpgspL)/corptaxpgspL)*100
gen chgexppgsp=((exppgsp-exppgspL)/exppgspL)*100
gen chggenexppgsp=((genexppgsp-genexppgspL)/genexppgspL)*100
gen chgsurpluspgsp=((surpluspgsp-surpluspgspL)/surpluspgspL)*100
gen chggensurpluspgsp=((gensurpluspgsp-gensurpluspgspL)/gensurpluspgspL)*100
gen chgdebtpgsp=((debtpgsp-debtpgspL)/debtpgspL)*100
gen chglegpgsp=((legpgsp-legpgspL)/legpgspL)*100
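*A caution on the percent changes above, shown with made-up numbers: when the lagged value is negative,
*as can happen for the surplus variables, the sign of the percent change is reversed.  
*A surplus that improves from -10 to -5 comes out as -50.
display ((-5-(-10))/(-10))*100
*A sketch of how to see how many rows are affected for one of the surplus variables:
count if real1surpluspcL<0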

save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\081StateEconAndGovFinancesForUse.dta", replace

sort stateno year quar
outsheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.csv", comma replace
clear

*NOW CHECK WORK WITH FILE 032, THE LAST EDITION'S STATE GOV FINANCES DATA

clear
insheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\032stategovfinances.csv", comma
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\077BaseDataset.dta", clear
merge 1:1 stateno year quar using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta"
*merge perfect
drop _merge
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", replace
drop in 1/l
outsheet using "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.csv", comma replace
clear

clear
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\temp.dta", clear
gen popdif=(pop-pop_quar)/pop
gen popadif=(pop_a-pop_annual)/pop_a
gen cpinatdif=(cpinat-cpi_bls_quar_nat)/cpinat
gen cpiregdif=(cpireg-cpi_bls_quar_regional)/cpireg
gen housedif=(house-housing_prices_quar)/house
gen incdif=(inc-personal_income1000s_quar)/inc
gen incadif=(inc_a-personal_income1000s_annual)/inc_a
gen disincdif=(disinc-disposable_personal_income1000s_)/disinc
gen gspnaicsadif=(gsp_a-gsp_naics_ann)/gsp_a
gen gspsicadif=(gsp_a-gsp_sic_ann)/gsp_a
gen gspnaicsdif=(gsp-gsp_naics_q)/gsp
gen gspsicdif=(gsp-gsp_sic_q)/gsp
gen revdif=(rev-total_revenue)/rev
gen genrevdif=(genrev-general_revenue)/genrev
gen taxdif=(tax-taxes)/tax
gen expdif=(exp-total_expenditure)/exp
gen genexpdif=(genexp-general_expenditure)/genexp
gen debtdif=(debt-total_debt_outstanding)/debt
gen legdif=(leg-leg_tot_exp1)/leg
*Bin each relative difference: beyond 5% -> +/-3, 1% to 5% -> +/-2, under 1% -> +/-1, effectively zero -> 0.
foreach v of varlist popdif popadif cpinatdif cpiregdif housedif incdif incadif disincdif gspnaicsadif gspsicadif gspnaicsdif gspsicdif revdif genrevdif taxdif expdif genexpdif debtdif legdif {
	recode `v' (-99/-.05=-3) (-.05/-.01=-2) (-.01/-.0000001=-1) (-.0000001/.0000001=0) (.0000001/.01=1) (.01/.05=2) (.05/99=3)
}
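*Optional sketch, not part of the original workflow: value labels may make the tabulations below easier to read.
*difbin is a hypothetical label name, and the bin descriptions are only approximate at the cutoffs.
*Values outside the -99 to 99 range are left unrecoded and will simply show without a label.
capture label drop difbin
label define difbin -3 "below -5%" -2 "-5% to -1%" -1 "-1% to 0" 0 "equal" 1 "0 to 1%" 2 "1% to 5%" 3 "above 5%"
label values popdif popadif cpinatdif cpiregdif housedif incdif incadif disincdif gspnaicsadif gspsicadif gspnaicsdif gspsicdif revdif genrevdif taxdif expdif genexpdif debtdif legdif difbin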
*Each comment below applies to the command on the line directly above it.
tab quar popdif
*when quar!=2.5, roughly the same.  quar=2.5 should be different.
tab quar popadif
*roughly the same
tab quar cpinatdif
*not the same, but difs almost always less than 1%.  
*perhaps this is because the two series use different base years.
sum cpinat cpi_bls_quar_nat if cpinat!=.&cpi_bls_quar_nat!=.
*No, their means are about the same.
reg cpinat cpi_bls_quar_nat
*R2=1.0000, coef=.9999871, constant=.0001274.  
scatter cpinat cpi_bls_quar_nat
*They look like a straight line, they are essentially the same.  
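*A possible extra check on the base-year idea (a sketch, not part of the original workflow):
*if the two series differed only by base year, their ratio would be roughly constant but not equal to 1.
*cpiratio is a hypothetical scratch variable and is dropped right away.
gen cpiratio=cpinat/cpi_bls_quar_nat
sum cpiratio, detail
drop cpiratio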
tab quar cpiregdif
*differences almost always less than 1%.
*perhaps this is because the two series use different base years.
sum cpireg cpi_bls_quar_regional if cpireg!=.&cpi_bls_quar_regional!=.
*Their means are about the same, so that doesn't seem to be the case.  
reg cpireg cpi_bls_quar_regional
*R2=1.0000, coef=.995136, constant=.0000401
scatter cpireg cpi_bls_quar_regional
*This is a straight line.  They are essentially the same thing.
tab quar housedif
tab quar incdif
*roughly the same
tab quar incadif
*identical
tab quar disincdif
*different
scatter disinc disposable_personal_income1000s_
reg disinc disposable_personal_income1000s_
*The above scatterplot and regression make these look like they are essentially the same variable.
tab quar gspnaicsadif
*different
reg gsp_a gsp_naics_ann
scatter gsp_a gsp_naics_ann
*The unit of measurement is completely different, but r2=1.0000.  This is why the difference between the two variables was so off.  
*These are essentially the same variable.
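*One possible way to see the unit conversion directly (a sketch, assuming the two series differ only by a constant scale factor):
*the ratio of the two should then be nearly the same number in every row.
*gspratio is a hypothetical scratch variable and is dropped right away.
gen gspratio=gsp_a/gsp_naics_ann
sum gspratio, detail
drop gspratio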
tab quar gspsicadif
*different
reg gsp_a gsp_sic_ann
scatter gsp_a gsp_sic_ann
*The unit of measurement is completely different, but r2=1.0000.  This is why the difference between the two variables was so off.  
*These are essentially the same variable.
tab quar gspnaicsdif
*different
reg gsp gsp_naics_q
scatter gsp gsp_naics_q
*some differences, but r2=.9999, essentially the same variable.  
*There were some slight differences in the construction of these variables, 
*so it isn't surprising that there are slight differences.
tab quar gspsicdif
*different
reg gsp gsp_sic_q
scatter gsp gsp_sic_q
*some differences, but r2=.9999, essentially the same variable.  
*There were some slight differences in the construction of these variables, 
*so it isn't surprising that there are slight differences.
tab quar revdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar genrevdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar taxdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar expdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar genexpdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar debtdif
*roughly the same.  There should be some differences, given updates, etc.
tab quar legdif
*much different
*The two variables being compared appear to have much different units of measurement.  
reg leg leg_tot_exp1
scatter leg leg_tot_exp1
*R2=1.0000, obviously totally different units of measurement.  
clear
*COMPARISONS INDICATE THAT NO ERRORS WERE INTRODUCED IN CONSTRUCTING THE NEW DATASET.

*A LITTLE MORE CLEANUP OF UNNECESSARY VARIABLES IN THE FINAL FILE.
use "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\081StateEconAndGovFinancesForUse.dta", clear
drop  nationdum isastate
drop isaquar
tab year if  fyirregular!=.
list stateno year quar if year>2013& fyirregular==.
*fyirregular is system missing for DC in q4 2014, 2015 and 2016.
tab fyirregular if stateno==8.5
replace fyirregular=1 if stateno==8.5&year>2013
*there should have been 11 changes, and there were 11 changes.
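*A possible automated version of this check (a sketch, not part of the original workflow):
*stop the do-file before the save if any DC row after 2013 still has a missing fyirregular.
assert !missing(fyirregular) if stateno==8.5&year>2013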
save "C:\Users\Carl\Documents\aaa_Overfile20150831\15StatePolitics\Data_Econ_StateGovFinances\081StateEconAndGovFinancesForUse.dta", replace

